2015-05-05 17:58:48

by Ingo Molnar

Subject: [PATCH 000/208] big x86 FPU code rewrite

[Second part of the series - Gmail didn't like me sending so many mails.]

Over the past 10 years the x86 FPU code has organically grown into
something of a spaghetti monster that few (if any) kernel
developers understand and whose code few people enjoy hacking.

Many people suggested over the years that it needs a major cleanup,
and some time ago I went "what the heck" and started doing it step
by step to see where it leads - it cannot be that hard!

Three weeks and 200+ patches later I think I have to admit that I
seriously underestimated the magnitude of the project! ;-)

This work-in-progress series is large, but I think it makes the
code maintainable and hackable again. It's pretty complete, as
per the 9 high-level goals laid out further below. Individual
patches are all fine-grained, so they should be easy to review -
Boris Petkov already reviewed most of the patches, so they are
not entirely raw.

Individual patches have been tested heavily for bisectability: they
were both build- and boot-tested on a relatively wide range of x86
hardware that I have access to. Nevertheless, the changes are pretty
invasive, so I'd expect there to be test failures.

This is the only time I intend to post them to lkml in their entirety,
so as not to spam lkml too much. (Future additions will be posted as
delta series.)

I'd like to ask interested people to test this tree, and to comment
on the patches. The changes can be found in the following Git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tmp.fpu

(The tree might be rebased, depending on feedback.)

Here are the main themes that motivated most of the changes:

1)

I collected all FPU code into arch/x86/kernel/fpu/*.c and split it
all up into the following, topically organized source code files:

-rw-rw-r-- 1 mingo mingo 1423 May 5 16:36 arch/x86/kernel/fpu/bugs.c
-rw-rw-r-- 1 mingo mingo 12206 May 5 16:36 arch/x86/kernel/fpu/core.c
-rw-rw-r-- 1 mingo mingo 7342 May 5 16:36 arch/x86/kernel/fpu/init.c
-rw-rw-r-- 1 mingo mingo 10909 May 5 16:36 arch/x86/kernel/fpu/measure.c
-rw-rw-r-- 1 mingo mingo 9012 May 5 16:36 arch/x86/kernel/fpu/regset.c
-rw-rw-r-- 1 mingo mingo 11188 May 5 16:36 arch/x86/kernel/fpu/signal.c
-rw-rw-r-- 1 mingo mingo 10140 May 5 16:36 arch/x86/kernel/fpu/xstate.c

Similarly I've collected and split up all FPU related header files, and
organized them topically:

-rw-rw-r-- 1 mingo mingo 1690 May 5 16:35 arch/x86/include/asm/fpu/api.h
-rw-rw-r-- 1 mingo mingo 12937 May 5 16:36 arch/x86/include/asm/fpu/internal.h
-rw-rw-r-- 1 mingo mingo 278 May 5 16:36 arch/x86/include/asm/fpu/measure.h
-rw-rw-r-- 1 mingo mingo 596 May 5 16:35 arch/x86/include/asm/fpu/regset.h
-rw-rw-r-- 1 mingo mingo 1013 May 5 16:35 arch/x86/include/asm/fpu/signal.h
-rw-rw-r-- 1 mingo mingo 8137 May 5 16:36 arch/x86/include/asm/fpu/types.h
-rw-rw-r-- 1 mingo mingo 5691 May 5 16:36 arch/x86/include/asm/fpu/xstate.h

<fpu/api.h> is the only 'public' API left, used in various drivers.

I decoupled drivers and non-FPU x86 code from various FPU internals.

2)

I renamed various internal data types, APIs and helpers, and organized
their support functions accordingly.

For example, all functions that deal with copying FPU registers in and
out of the FPU are now named consistently:

copy_fxregs_to_kernel() # was: fpu_fxsave()
copy_xregs_to_kernel() # was: xsave_state()

copy_kernel_to_fregs() # was: frstor_checking()
copy_kernel_to_fxregs() # was: fxrstor_checking()
copy_kernel_to_xregs() # was: fpu_xrstor_checking()
copy_kernel_to_xregs_booting() # was: xrstor_state_booting()

copy_fregs_to_user() # was: fsave_user()
copy_fxregs_to_user() # was: fxsave_user()
copy_xregs_to_user() # was: xsave_user()

copy_user_to_fregs() # was: frstor_user()
copy_user_to_fxregs() # was: fxrstor_user()
copy_user_to_xregs() # was: xrestore_user()
copy_user_to_fpregs_zeroing() # was: restore_user_xstate()

'xregs' stands for registers supported by XSAVE
'fxregs' stands for registers supported by FXSAVE
'fregs' stands for registers supported by FSAVE
'fpregs' stands for generic FPU registers.

Similarly, the high level FPU functions got reorganized as well:

extern void fpu__activate_curr(struct fpu *fpu);
extern void fpu__activate_stopped(struct fpu *fpu);
extern void fpu__save(struct fpu *fpu);
extern void fpu__restore(struct fpu *fpu);
extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
extern void fpu__clear(struct fpu *fpu);
extern int fpu__exception_code(struct fpu *fpu, int trap_nr);

Those functions that used to take a task_struct argument now take
the more limited 'struct fpu' argument, and their naming is consistent
and logical as well.

Likewise, the FP state data types are now consistently named as well:

struct fregs_state;
struct fxregs_state;
struct swregs_state;
struct xregs_state;

union fpregs_state;

3)

Various core data types got streamlined around four byte-sized fields in 'struct fpu':

fpu->fpstate_active # was: tsk->flags & PF_USED_MATH
fpu->fpregs_active # was: fpu->has_fpu
fpu->last_cpu
fpu->counter

which now fit into a single word.

4)

task->thread.fpu->state got embedded again, as task->thread.fpu.state. This
eliminated a lot of awkward late dynamic memory allocation of FPU state
and the problematic handling of failures.

Note that while the allocation is static right now, this is a WIP interim
state: we can still do dynamic allocation of FPU state, by moving the FPU
state last in task_struct and then allocating task_struct accordingly.

5)

The amazingly convoluted init dependencies got sorted out into two
cleanly separated families of initialization functions: the
fpu__init_system_*() functions, and the fpu__init_cpu_*() functions.

This allowed the removal of various __init annotation hacks and
obscure boot time checks.

6)

Decoupled the FPU core from the xsave code. xsave.c and xsave.h got
shrunk quite a bit, and they now host only XSAVE/etc. related
functionality, not generic FPU handling functions.

7)

Added a ton of comments explaining how things work and why, hopefully
making this code accessible to everyone interested.

8)

Added FPU debugging code (CONFIG_X86_DEBUG_FPU=y) and added an FPU hw
benchmarking subsystem (CONFIG_X86_DEBUG_FPU_MEASUREMENTS=y), which
performs boot time measurements like:

x86/fpu:##################################################################
x86/fpu: Running FPU performance measurement suite (cache hot):
x86/fpu: Cost of: null : 108 cycles
x86/fpu:######## CPU instructions: ############################
x86/fpu: Cost of: NOP insn : 0 cycles
x86/fpu: Cost of: RDTSC insn : 12 cycles
x86/fpu: Cost of: RDMSR insn : 100 cycles
x86/fpu: Cost of: WRMSR insn : 396 cycles
x86/fpu: Cost of: CLI insn same-IF : 0 cycles
x86/fpu: Cost of: CLI insn flip-IF : 0 cycles
x86/fpu: Cost of: STI insn same-IF : 0 cycles
x86/fpu: Cost of: STI insn flip-IF : 0 cycles
x86/fpu: Cost of: PUSHF insn : 0 cycles
x86/fpu: Cost of: POPF insn same-IF : 20 cycles
x86/fpu: Cost of: POPF insn flip-IF : 28 cycles
x86/fpu:######## IRQ save/restore APIs: ############################
x86/fpu: Cost of: local_irq_save() fn : 20 cycles
x86/fpu: Cost of: local_irq_restore() fn same-IF : 24 cycles
x86/fpu: Cost of: local_irq_restore() fn flip-IF : 28 cycles
x86/fpu: Cost of: irq_save()+restore() fn same-IF : 48 cycles
x86/fpu: Cost of: irq_save()+restore() fn flip-IF : 48 cycles
x86/fpu:######## locking APIs: ############################
x86/fpu: Cost of: smp_mb() fn : 40 cycles
x86/fpu: Cost of: cpu_relax() fn : 8 cycles
x86/fpu: Cost of: spin_lock()+unlock() fn : 64 cycles
x86/fpu: Cost of: read_lock()+unlock() fn : 76 cycles
x86/fpu: Cost of: write_lock()+unlock() fn : 52 cycles
x86/fpu: Cost of: rcu_read_lock()+unlock() fn : 16 cycles
x86/fpu: Cost of: preempt_disable()+enable() fn : 20 cycles
x86/fpu: Cost of: mutex_lock()+unlock() fn : 56 cycles
x86/fpu:######## MM instructions: ############################
x86/fpu: Cost of: __flush_tlb() fn : 132 cycles
x86/fpu: Cost of: __flush_tlb_global() fn : 920 cycles
x86/fpu: Cost of: __flush_tlb_one() fn : 288 cycles
x86/fpu: Cost of: __flush_tlb_range() fn : 412 cycles
x86/fpu:######## FPU instructions: ############################
x86/fpu: Cost of: CR0 read : 4 cycles
x86/fpu: Cost of: CR0 write : 208 cycles
x86/fpu: Cost of: CR0::TS fault : 1156 cycles
x86/fpu: Cost of: FNINIT insn : 76 cycles
x86/fpu: Cost of: FWAIT insn : 0 cycles
x86/fpu: Cost of: FSAVE insn : 168 cycles
x86/fpu: Cost of: FRSTOR insn : 160 cycles
x86/fpu: Cost of: FXSAVE insn : 84 cycles
x86/fpu: Cost of: FXRSTOR insn : 44 cycles
x86/fpu: Cost of: FXRSTOR fault : 688 cycles
x86/fpu: Cost of: XSAVE insn : 104 cycles
x86/fpu: Cost of: XRSTOR insn : 80 cycles
x86/fpu: Cost of: XRSTOR fault : 884 cycles
x86/fpu:##################################################################

Based on such measurements we'll be able to do performance tuning,
set default policies and do optimizations in a more informed fashion,
as the speed of various x86 hardware varies a lot.

9)

Reworked many ancient inlining and uninlining decisions based on
modern principles.


Any feedback is welcome!

Thanks,

Ingo

=====
Ingo Molnar (208):
x86/fpu: Rename unlazy_fpu() to fpu__save()
x86/fpu: Add comments to fpu__save() and restrict its export
x86/fpu: Add debugging check to fpu__save()
x86/fpu: Rename fpu_detect() to fpu__detect()
x86/fpu: Remove stale init_fpu() prototype
x86/fpu: Split an fpstate_alloc_init() function out of init_fpu()
x86/fpu: Make init_fpu() static
x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check
x86/fpu: Optimize fpu__unlazy_stopped()
x86/fpu: Simplify fpu__unlazy_stopped()
x86/fpu: Remove fpu_allocated()
x86/fpu: Move fpu_alloc() out of line
x86/fpu: Rename fpu_alloc() to fpstate_alloc()
x86/fpu: Rename fpu_free() to fpstate_free()
x86/fpu: Rename fpu_finit() to fpstate_init()
x86/fpu: Rename fpu_init() to fpu__cpu_init()
x86/fpu: Rename init_thread_xstate() to fpstate_xstate_init_size()
x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter
x86/fpu: Improve the comment for the fpu::counter field
x86/fpu: Move FPU data structures to asm/fpu_types.h
x86/fpu: Clean up asm/fpu/types.h
x86/fpu: Move i387.c and xsave.c to arch/x86/kernel/fpu/
x86/fpu: Fix header file dependencies of fpu-internal.h
x86/fpu: Split out the boot time FPU init code into fpu/init.c
x86/fpu: Remove unnecessary includes from core.c
x86/fpu: Move the no_387 handling and FPU detection code into init.c
x86/fpu: Remove the free_thread_xstate() complication
x86/fpu: Factor out fpu__flush_thread() from flush_thread()
x86/fpu: Move math_state_restore() to fpu/core.c
x86/fpu: Rename math_state_restore() to fpu__restore()
x86/fpu: Factor out the FPU bug detection code into fpu__init_check_bugs()
x86/fpu: Simplify the xsave_state*() methods
x86/fpu: Remove fpu_xsave()
x86/fpu: Move task_xstate_cachep handling to core.c
x86/fpu: Factor out fpu__copy()
x86/fpu: Uninline fpstate_free() and move it next to the allocation function
x86/fpu: Make task_xstate_cachep static
x86/fpu: Make kernel_fpu_disable/enable() static
x86/fpu: Add debug check to kernel_fpu_disable()
x86/fpu: Add kernel_fpu_disabled()
x86/fpu: Remove __save_init_fpu()
x86/fpu: Move fpu_copy() to fpu/core.c
x86/fpu: Add debugging check to fpu_copy()
x86/fpu: Print out whether we are doing lazy/eager FPU context switches
x86/fpu: Eliminate the __thread_has_fpu() wrapper
x86/fpu: Change __thread_clear_has_fpu() to 'struct fpu' parameter
x86/fpu: Move 'PER_CPU(fpu_owner_task)' to fpu/core.c
x86/fpu: Change fpu_owner_task to fpu_fpregs_owner_ctx
x86/fpu: Remove 'struct task_struct' usage from __thread_set_has_fpu()
x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_end()
x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_begin()
x86/fpu: Open code PF_USED_MATH usages
x86/fpu: Document fpu__unlazy_stopped()
x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active
x86/fpu: Remove 'struct task_struct' usage from drop_fpu()
x86/fpu: Remove task_disable_lazy_fpu_restore()
x86/fpu: Use 'struct fpu' in fpu_lazy_restore()
x86/fpu: Use 'struct fpu' in restore_fpu_checking()
x86/fpu: Use 'struct fpu' in fpu_reset_state()
x86/fpu: Use 'struct fpu' in switch_fpu_prepare()
x86/fpu: Use 'struct fpu' in switch_fpu_finish()
x86/fpu: Move __save_fpu() into fpu/core.c
x86/fpu: Use 'struct fpu' in __fpu_save()
x86/fpu: Use 'struct fpu' in fpu__save()
x86/fpu: Use 'struct fpu' in fpu_copy()
x86/fpu: Use 'struct fpu' in fpu__copy()
x86/fpu: Use 'struct fpu' in fpstate_alloc_init()
x86/fpu: Use 'struct fpu' in fpu__unlazy_stopped()
x86/fpu: Rename fpu__flush_thread() to fpu__clear()
x86/fpu: Clean up fpu__clear() a bit
x86/fpu: Rename i387.h to fpu/api.h
x86/fpu: Move xsave.h to fpu/xsave.h
x86/fpu: Rename fpu-internal.h to fpu/internal.h
x86/fpu: Move MXCSR_DEFAULT to fpu/internal.h
x86/fpu: Remove xsave_init() __init obfuscation
x86/fpu: Remove assembly guard from asm/fpu/api.h
x86/fpu: Improve FPU detection kernel messages
x86/fpu: Print supported xstate features in human readable way
x86/fpu: Rename 'pcntxt_mask' to 'xfeatures_mask'
x86/fpu: Rename 'xstate_features' to 'xfeatures_nr'
x86/fpu: Move XCR0 manipulation to the FPU code proper
x86/fpu: Clean up regset functions
x86/fpu: Rename 'xsave_hdr' to 'header'
x86/fpu: Rename xsave.header::xstate_bv to 'xfeatures'
x86/fpu: Clean up and fix MXCSR handling
x86/fpu: Rename regset FPU register accessors
x86/fpu: Explain the AVX register layout in the xsave area
x86/fpu: Improve the __sanitize_i387_state() documentation
x86/fpu: Rename fpu->has_fpu to fpu->fpregs_active
x86/fpu: Rename __thread_set_has_fpu() to __fpregs_activate()
x86/fpu: Rename __thread_clear_has_fpu() to __fpregs_deactivate()
x86/fpu: Rename __thread_fpu_begin() to fpregs_activate()
x86/fpu: Rename __thread_fpu_end() to fpregs_deactivate()
x86/fpu: Remove fpstate_xstate_init_size() boot quirk
x86/fpu: Remove xsave_init() bootmem allocations
x86/fpu: Make setup_init_fpu_buf() run-once explicitly
x86/fpu: Remove 'init_xstate_buf' bootmem allocation
x86/fpu: Split fpu__cpu_init() into early-boot and cpu-boot parts
x86/fpu: Make the system/cpu init distinction clear in the xstate code as well
x86/fpu: Move CPU capability check into fpu__init_cpu_xstate()
x86/fpu: Move legacy check to fpu__init_system_xstate()
x86/fpu: Propagate once per boot quirk into fpu__init_system_xstate()
x86/fpu: Remove xsave_init()
x86/fpu: Do fpu__init_system_xstate only from fpu__init_system()
x86/fpu: Set up the legacy FPU init image from fpu__init_system()
x86/fpu: Remove setup_init_fpu_buf() call from eager_fpu_init()
x86/fpu: Move all eager-fpu setup code to eager_fpu_init()
x86/fpu: Move eager_fpu_init() to fpu/init.c
x86/fpu: Clean up eager_fpu_init() and rename it to fpu__ctx_switch_init()
x86/fpu: Split fpu__ctx_switch_init() into _cpu() and _system() portions
x86/fpu: Do CLTS fpu__init_system()
x86/fpu: Move the fpstate_xstate_init_size() call into fpu__init_system()
x86/fpu: Call fpu__init_cpu_ctx_switch() from fpu__init_cpu()
x86/fpu: Do system-wide setup from fpu__detect()
x86/fpu: Remove fpu__init_cpu_ctx_switch() call from fpu__init_system()
x86/fpu: Simplify fpu__cpu_init()
x86/fpu: Factor out fpu__init_cpu_generic()
x86/fpu: Factor out fpu__init_system_generic()
x86/fpu: Factor out fpu__init_system_early_generic()
x86/fpu: Move !FPU check ingo fpu__init_system_early_generic()
x86/fpu: Factor out FPU bug checks into fpu/bugs.c
x86/fpu: Make check_fpu() init ordering independent
x86/fpu: Move fpu__init_system_early_generic() out of fpu__detect()
x86/fpu: Remove the extra fpu__detect() layer
x86/fpu: Rename fpstate_xstate_init_size() to fpu__init_system_xstate_size_legacy()
x86/fpu: Reorder init methods
x86/fpu: Add more comments to the FPU init code
x86/fpu: Move fpu__save() to fpu/internals.h
x86/fpu: Uninline kernel_fpu_begin()/end()
x86/fpu: Move various internal function prototypes to fpu/internal.h
x86/fpu: Uninline the irq_ts_save()/restore() functions
x86/fpu: Rename fpu_save_init() to copy_fpregs_to_fpstate()
x86/fpu: Optimize copy_fpregs_to_fpstate() by removing the FNCLEX synchronization with FP exceptions
x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again)
x86/fpu: Remove failure paths from fpstate-alloc low level functions
x86/fpu: Remove failure return from fpstate_alloc_init()
x86/fpu: Rename fpstate_alloc_init() to fpstate_init_curr()
x86/fpu: Simplify fpu__unlazy_stopped() error handling
x86/fpu, kvm: Simplify fx_init()
x86/fpu: Simplify fpstate_init_curr() usage
x86/fpu: Rename fpu__unlazy_stopped() to fpu__activate_stopped()
x86/fpu: Factor out FPU hw activation/deactivation
x86/fpu: Simplify __save_fpu()
x86/fpu: Eliminate __save_fpu()
x86/fpu: Simplify fpu__save()
x86/fpu: Optimize fpu__save()
x86/fpu: Optimize fpu_copy()
x86/fpu: Optimize fpu_copy() some more on lazy switching systems
x86/fpu: Rename fpu/xsave.h to fpu/xstate.h
x86/fpu: Rename fpu/xsave.c to fpu/xstate.c
x86/fpu: Introduce cpu_has_xfeatures(xfeatures_mask, feature_name)
x86/fpu: Simplify print_xstate_features()
x86/fpu: Enumerate xfeature bits
x86/fpu: Move xfeature type enumeration to fpu/types.h
x86/fpu, crypto x86/camellia_aesni_avx: Simplify the camellia_aesni_init() xfeature checks
x86/fpu, crypto x86/sha256_ssse3: Simplify the sha256_ssse3_mod_init() xfeature checks
x86/fpu, crypto x86/camellia_aesni_avx2: Simplify the camellia_aesni_init() xfeature checks
x86/fpu, crypto x86/twofish_avx: Simplify the twofish_init() xfeature checks
x86/fpu, crypto x86/serpent_avx: Simplify the serpent_init() xfeature checks
x86/fpu, crypto x86/cast5_avx: Simplify the cast5_init() xfeature checks
x86/fpu, crypto x86/sha512_ssse3: Simplify the sha512_ssse3_mod_init() xfeature checks
x86/fpu, crypto x86/cast6_avx: Simplify the cast6_init() xfeature checks
x86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature checks
x86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks
x86/fpu, crypto x86/sha1_mb: Remove FPU internal headers from sha1_mb.c
x86/fpu: Move asm/xcr.h to asm/fpu/internal.h
x86/fpu: Rename sanitize_i387_state() to fpstate_sanitize_xstate()
x86/fpu: Simplify fpstate_sanitize_xstate() calls
x86/fpu: Pass 'struct fpu' to fpstate_sanitize_xstate()
x86/fpu: Rename save_xstate_sig() to copy_fpstate_to_sigframe()
x86/fpu: Rename save_user_xstate() to copy_fpregs_to_sigframe()
x86/fpu: Clarify ancient comments in fpu__restore()
x86/fpu: Rename user_has_fpu() to fpregs_active()
x86/fpu: Initialize fpregs in fpu__init_cpu_generic()
x86/fpu: Clean up fpu__clear() state handling
x86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_state()
x86/fpu: Synchronize the naming of drop_fpu() and fpu_reset_state()
x86/fpu: Rename restore_fpu_checking() to copy_fpstate_to_fpregs()
x86/fpu: Move all the fpu__*() high level methods closer to each other
x86/fpu: Move fpu__clear() to 'struct fpu *' parameter passing
x86/fpu: Rename restore_xstate_sig() to fpu__restore_sig()
x86/fpu: Move the signal frame handling code closer to each other
x86/fpu: Merge fpu__reset() and fpu__clear()
x86/fpu: Move is_ia32*frame() helpers out of fpu/internal.h
x86/fpu: Split out fpu/signal.h from fpu/internal.h for signal frame handling functions
x86/fpu: Factor out fpu/regset.h from fpu/internal.h
x86/fpu: Remove run-once init quirks
x86/fpu: Factor out the exception error code handling code
x86/fpu: Harmonize the names of the fpstate_init() helper functions
x86/fpu: Create 'union thread_xstate' helper for fpstate_init()
x86/fpu: Generalize 'init_xstate_ctx'
x86/fpu: Move restore_init_xstate() out of fpu/internal.h
x86/fpu: Rename all the fpregs, xregs, fxregs and fregs handling functions
x86/fpu: Factor out fpu/signal.c
x86/fpu: Factor out the FPU regset code into fpu/regset.c
x86/fpu: Harmonize FPU register state types
x86/fpu: Change fpu->fpregs_active from 'int' to 'char', add lazy switching comments
x86/fpu: Document the various fpregs state formats
x86/fpu: Move debugging check from kernel_fpu_begin() to __kernel_fpu_begin()
x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end
x86/fpu: Clean up xstate feature reservation
x86/fpu/xstate: Clean up setup_xstate_comp() call
x86/fpu/init: Propagate __init annotations
x86/fpu: Pass 'struct fpu' to fpu__restore()
x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT
x86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging code
x86/fpu: Add FPU performance measurement subsystem
x86/fpu: Reorganize fpu/internal.h

Documentation/preempt-locking.txt | 2 +-
arch/x86/Kconfig.debug | 27 ++
arch/x86/crypto/aesni-intel_glue.c | 2 +-
arch/x86/crypto/camellia_aesni_avx2_glue.c | 15 +-
arch/x86/crypto/camellia_aesni_avx_glue.c | 15 +-
arch/x86/crypto/cast5_avx_glue.c | 15 +-
arch/x86/crypto/cast6_avx_glue.c | 15 +-
arch/x86/crypto/crc32-pclmul_glue.c | 2 +-
arch/x86/crypto/crc32c-intel_glue.c | 3 +-
arch/x86/crypto/crct10dif-pclmul_glue.c | 2 +-
arch/x86/crypto/fpu.c | 2 +-
arch/x86/crypto/ghash-clmulni-intel_glue.c | 2 +-
arch/x86/crypto/serpent_avx2_glue.c | 15 +-
arch/x86/crypto/serpent_avx_glue.c | 15 +-
arch/x86/crypto/sha-mb/sha1_mb.c | 5 +-
arch/x86/crypto/sha1_ssse3_glue.c | 16 +-
arch/x86/crypto/sha256_ssse3_glue.c | 16 +-
arch/x86/crypto/sha512_ssse3_glue.c | 16 +-
arch/x86/crypto/twofish_avx_glue.c | 16 +-
arch/x86/ia32/ia32_signal.c | 13 +-
arch/x86/include/asm/alternative.h | 6 +
arch/x86/include/asm/crypto/glue_helper.h | 2 +-
arch/x86/include/asm/efi.h | 2 +-
arch/x86/include/asm/fpu-internal.h | 626 ---------------------------------------
arch/x86/include/asm/fpu/api.h | 48 +++
arch/x86/include/asm/fpu/internal.h | 488 ++++++++++++++++++++++++++++++
arch/x86/include/asm/fpu/measure.h | 13 +
arch/x86/include/asm/fpu/regset.h | 21 ++
arch/x86/include/asm/fpu/signal.h | 33 +++
arch/x86/include/asm/fpu/types.h | 293 ++++++++++++++++++
arch/x86/include/asm/{xsave.h => fpu/xstate.h} | 60 ++--
arch/x86/include/asm/i387.h | 108 -------
arch/x86/include/asm/kvm_host.h | 2 -
arch/x86/include/asm/mpx.h | 8 +-
arch/x86/include/asm/processor.h | 141 +--------
arch/x86/include/asm/simd.h | 2 +-
arch/x86/include/asm/stackprotector.h | 2 +
arch/x86/include/asm/suspend_32.h | 2 +-
arch/x86/include/asm/suspend_64.h | 2 +-
arch/x86/include/asm/user.h | 12 +-
arch/x86/include/asm/xcr.h | 49 ---
arch/x86/include/asm/xor.h | 2 +-
arch/x86/include/asm/xor_32.h | 2 +-
arch/x86/include/asm/xor_avx.h | 2 +-
arch/x86/include/uapi/asm/sigcontext.h | 8 +-
arch/x86/kernel/Makefile | 2 +-
arch/x86/kernel/alternative.c | 5 +
arch/x86/kernel/cpu/bugs.c | 57 +---
arch/x86/kernel/cpu/bugs_64.c | 2 +
arch/x86/kernel/cpu/common.c | 29 +-
arch/x86/kernel/fpu/Makefile | 11 +
arch/x86/kernel/fpu/bugs.c | 71 +++++
arch/x86/kernel/fpu/core.c | 509 +++++++++++++++++++++++++++++++
arch/x86/kernel/fpu/init.c | 288 ++++++++++++++++++
arch/x86/kernel/fpu/measure.c | 509 +++++++++++++++++++++++++++++++
arch/x86/kernel/fpu/regset.c | 356 ++++++++++++++++++++++
arch/x86/kernel/fpu/signal.c | 404 +++++++++++++++++++++++++
arch/x86/kernel/fpu/xstate.c | 406 +++++++++++++++++++++++++
arch/x86/kernel/i387.c | 656 ----------------------------------------
arch/x86/kernel/process.c | 52 +---
arch/x86/kernel/process_32.c | 15 +-
arch/x86/kernel/process_64.c | 13 +-
arch/x86/kernel/ptrace.c | 12 +-
arch/x86/kernel/signal.c | 38 ++-
arch/x86/kernel/smpboot.c | 3 +-
arch/x86/kernel/traps.c | 120 ++------
arch/x86/kernel/xsave.c | 724 ---------------------------------------------
arch/x86/kvm/cpuid.c | 2 +-
arch/x86/kvm/vmx.c | 5 +-
arch/x86/kvm/x86.c | 68 ++---
arch/x86/lguest/boot.c | 2 +-
arch/x86/lib/mmx_32.c | 2 +-
arch/x86/math-emu/fpu_aux.c | 4 +-
arch/x86/math-emu/fpu_entry.c | 20 +-
arch/x86/math-emu/fpu_system.h | 2 +-
arch/x86/mm/mpx.c | 15 +-
arch/x86/power/cpu.c | 11 +-
arch/x86/xen/enlighten.c | 2 +-
drivers/char/hw_random/via-rng.c | 2 +-
drivers/crypto/padlock-aes.c | 2 +-
drivers/crypto/padlock-sha.c | 2 +-
drivers/lguest/x86/core.c | 12 +-
lib/raid6/x86.h | 2 +-
83 files changed, 3742 insertions(+), 2841 deletions(-)
delete mode 100644 arch/x86/include/asm/fpu-internal.h
create mode 100644 arch/x86/include/asm/fpu/api.h
create mode 100644 arch/x86/include/asm/fpu/internal.h
create mode 100644 arch/x86/include/asm/fpu/measure.h
create mode 100644 arch/x86/include/asm/fpu/regset.h
create mode 100644 arch/x86/include/asm/fpu/signal.h
create mode 100644 arch/x86/include/asm/fpu/types.h
rename arch/x86/include/asm/{xsave.h => fpu/xstate.h} (77%)
delete mode 100644 arch/x86/include/asm/i387.h
delete mode 100644 arch/x86/include/asm/xcr.h
create mode 100644 arch/x86/kernel/fpu/Makefile
create mode 100644 arch/x86/kernel/fpu/bugs.c
create mode 100644 arch/x86/kernel/fpu/core.c
create mode 100644 arch/x86/kernel/fpu/init.c
create mode 100644 arch/x86/kernel/fpu/measure.c
create mode 100644 arch/x86/kernel/fpu/regset.c
create mode 100644 arch/x86/kernel/fpu/signal.c
create mode 100644 arch/x86/kernel/fpu/xstate.c
delete mode 100644 arch/x86/kernel/i387.c
delete mode 100644 arch/x86/kernel/xsave.c

--
2.1.0


2015-05-05 17:58:59

by Ingo Molnar

Subject: [PATCH 162/208] x86/fpu, crypto x86/cast6_avx: Simplify the cast6_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages for the driver:

- Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

- Removes detection complexity from the driver, no more raw XGETBV instruction

- Shrinks the code a bit:

text data bss dec hex filename
2128 2896 0 5024 13a0 camellia_aesni_avx_glue.o.before
2067 2896 0 4963 1363 camellia_aesni_avx_glue.o.after

- Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/crypto/cast6_avx_glue.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c
index 21d0b845c8c4..5dbba7224221 100644
--- a/arch/x86/crypto/cast6_avx_glue.c
+++ b/arch/x86/crypto/cast6_avx_glue.c
@@ -36,8 +36,7 @@
#include <crypto/ctr.h>
#include <crypto/lrw.h>
#include <crypto/xts.h>
-#include <asm/xcr.h>
-#include <asm/fpu/xstate.h>
+#include <asm/fpu/api.h>
#include <asm/crypto/glue_helper.h>

#define CAST6_PARALLEL_BLOCKS 8
@@ -590,16 +589,10 @@ static struct crypto_alg cast6_algs[10] = { {

static int __init cast6_init(void)
{
- u64 xcr0;
+ const char *feature_name;

- if (!cpu_has_avx || !cpu_has_osxsave) {
- pr_info("AVX instructions are not detected.\n");
- return -ENODEV;
- }
-
- xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
- if ((xcr0 & (XSTATE_SSE | XSTATE_YMM)) != (XSTATE_SSE | XSTATE_YMM)) {
- pr_info("AVX detected but unusable.\n");
+ if (!cpu_has_xfeatures(XSTATE_SSE | XSTATE_YMM, &feature_name)) {
+ pr_info("CPU feature '%s' is not supported.\n", feature_name);
return -ENODEV;
}

--
2.1.0

2015-05-05 18:15:43

by Ingo Molnar

Subject: [PATCH 163/208] x86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages for the driver:

- Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

- Removes detection complexity from the driver, no more raw XGETBV instruction

- Shrinks the code a bit.

- Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/crypto/sha1_ssse3_glue.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 84db12f052e8..7c48e8b20848 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -30,8 +30,6 @@
#include <crypto/sha.h>
#include <crypto/sha1_base.h>
#include <asm/fpu/api.h>
-#include <asm/xcr.h>
-#include <asm/fpu/xstate.h>


asmlinkage void sha1_transform_ssse3(u32 *digest, const char *data,
@@ -123,15 +121,9 @@ static struct shash_alg alg = {
#ifdef CONFIG_AS_AVX
static bool __init avx_usable(void)
{
- u64 xcr0;
-
- if (!cpu_has_avx || !cpu_has_osxsave)
- return false;
-
- xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
- if ((xcr0 & (XSTATE_SSE | XSTATE_YMM)) != (XSTATE_SSE | XSTATE_YMM)) {
- pr_info("AVX detected but unusable.\n");
-
+ if (!cpu_has_xfeatures(XSTATE_SSE | XSTATE_YMM, NULL)) {
+ if (cpu_has_avx)
+ pr_info("AVX detected but unusable.\n");
return false;
}

--
2.1.0

2015-05-05 17:59:04

by Ingo Molnar

Subject: [PATCH 164/208] x86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks

Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.

This has the following advantages for the driver:

- Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.

- Removes detection complexity from the driver, no more raw XGETBV instruction

- Shrinks the code a bit.

- Standardizes feature name error message printouts across drivers

There are also advantages to the x86 FPU code: once all drivers
are decoupled from internals we can move them out of common
headers and we'll also be able to remove xcr.h.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/crypto/serpent_avx2_glue.c | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c
index aa325fa5c7a6..f226ad41fde1 100644
--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -20,8 +20,7 @@
#include <crypto/lrw.h>
#include <crypto/xts.h>
#include <crypto/serpent.h>
-#include <asm/xcr.h>
-#include <asm/fpu/xstate.h>
+#include <asm/fpu/api.h>
#include <asm/crypto/serpent-avx.h>
#include <asm/crypto/glue_helper.h>

@@ -537,16 +536,10 @@ static struct crypto_alg srp_algs[10] = { {

static int __init init(void)
{
- u64 xcr0;
+ const char *feature_name;

- if (!cpu_has_avx2 || !cpu_has_osxsave) {
- pr_info("AVX2 instructions are not detected.\n");
- return -ENODEV;
- }
-
- xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
- if ((xcr0 & (XSTATE_SSE | XSTATE_YMM)) != (XSTATE_SSE | XSTATE_YMM)) {
- pr_info("AVX detected but unusable.\n");
+ if (!cpu_has_xfeatures(XSTATE_SSE | XSTATE_YMM, &feature_name)) {
+ pr_info("CPU feature '%s' is not supported.\n", feature_name);
return -ENODEV;
}

--
2.1.0

2015-05-05 18:15:15

by Ingo Molnar

Subject: [PATCH 165/208] x86/fpu, crypto x86/sha1_mb: Remove FPU internal headers from sha1_mb.c

This file only uses the public FPU APIs, so remove the xcr.h, fpu/xstate.h
and fpu/internal.h headers and add the fpu/api.h include.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/crypto/sha-mb/sha1_mb.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/crypto/sha-mb/sha1_mb.c b/arch/x86/crypto/sha-mb/sha1_mb.c
index 6f3f76568bd5..f53ed1dc88ea 100644
--- a/arch/x86/crypto/sha-mb/sha1_mb.c
+++ b/arch/x86/crypto/sha-mb/sha1_mb.c
@@ -65,10 +65,8 @@
#include <crypto/mcryptd.h>
#include <crypto/crypto_wq.h>
#include <asm/byteorder.h>
-#include <asm/xcr.h>
-#include <asm/fpu/xstate.h>
#include <linux/hardirq.h>
-#include <asm/fpu/internal.h>
+#include <asm/fpu/api.h>
#include "sha_mb_ctx.h"

#define FLUSH_INTERVAL 1000 /* in usec */
--
2.1.0

2015-05-05 17:59:09

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 166/208] x86/fpu: Move asm/xcr.h to asm/fpu/internal.h

Now that all the drivers that were using FPU internals have been
converted to the public APIs, move xcr.h's definitions into
fpu/internal.h and remove xcr.h.
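The xgetbv()/xsetbv() helpers being moved here recombine and split the 64-bit XCR value across the EDX:EAX register pair around the instruction. The arithmetic itself can be sketched without inline assembly (a plain-C model for illustration, matching the helpers in the diff below term for term):

```c
#include <assert.h>
#include <stdint.h>

/*
 * XGETBV returns the 64-bit extended control register value split across
 * EDX:EAX; the kernel's xgetbv() helper recombines the halves like this:
 */
static uint64_t xcr_combine(uint32_t eax, uint32_t edx)
{
	return eax + ((uint64_t)edx << 32);
}

/* xsetbv() performs the inverse split before issuing the instruction: */
static void xcr_split(uint64_t value, uint32_t *eax, uint32_t *edx)
{
	*eax = (uint32_t)value;
	*edx = (uint32_t)(value >> 32);
}
```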

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 25 +++++++++++++++++++++++++
arch/x86/include/asm/xcr.h | 49 -------------------------------------------------
arch/x86/kernel/fpu/xstate.c | 1 -
arch/x86/kvm/vmx.c | 1 -
arch/x86/kvm/x86.c | 1 -
5 files changed, 25 insertions(+), 52 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 8ec785ecce81..161b51bf267e 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -431,6 +431,31 @@ static inline void fpu_reset_state(struct fpu *fpu)
}

/*
+ * Definitions for the eXtended Control Register instructions
+ */
+
+#define XCR_XFEATURE_ENABLED_MASK 0x00000000
+
+static inline u64 xgetbv(u32 index)
+{
+ u32 eax, edx;
+
+ asm volatile(".byte 0x0f,0x01,0xd0" /* xgetbv */
+ : "=a" (eax), "=d" (edx)
+ : "c" (index));
+ return eax + ((u64)edx << 32);
+}
+
+static inline void xsetbv(u32 index, u64 value)
+{
+ u32 eax = value;
+ u32 edx = value >> 32;
+
+ asm volatile(".byte 0x0f,0x01,0xd1" /* xsetbv */
+ : : "a" (eax), "d" (edx), "c" (index));
+}
+
+/*
* FPU state switching for scheduling.
*
* This is a two-stage process:
diff --git a/arch/x86/include/asm/xcr.h b/arch/x86/include/asm/xcr.h
deleted file mode 100644
index f2cba4e79a23..000000000000
--- a/arch/x86/include/asm/xcr.h
+++ /dev/null
@@ -1,49 +0,0 @@
-/* -*- linux-c -*- ------------------------------------------------------- *
- *
- * Copyright 2008 rPath, Inc. - All Rights Reserved
- *
- * This file is part of the Linux kernel, and is made available under
- * the terms of the GNU General Public License version 2 or (at your
- * option) any later version; incorporated herein by reference.
- *
- * ----------------------------------------------------------------------- */
-
-/*
- * asm-x86/xcr.h
- *
- * Definitions for the eXtended Control Register instructions
- */
-
-#ifndef _ASM_X86_XCR_H
-#define _ASM_X86_XCR_H
-
-#define XCR_XFEATURE_ENABLED_MASK 0x00000000
-
-#ifdef __KERNEL__
-# ifndef __ASSEMBLY__
-
-#include <linux/types.h>
-
-static inline u64 xgetbv(u32 index)
-{
- u32 eax, edx;
-
- asm volatile(".byte 0x0f,0x01,0xd0" /* xgetbv */
- : "=a" (eax), "=d" (edx)
- : "c" (index));
- return eax + ((u64)edx << 32);
-}
-
-static inline void xsetbv(u32 index, u64 value)
-{
- u32 eax = value;
- u32 edx = value >> 32;
-
- asm volatile(".byte 0x0f,0x01,0xd1" /* xsetbv */
- : : "a" (eax), "d" (edx), "c" (index));
-}
-
-# endif /* __ASSEMBLY__ */
-#endif /* __KERNEL__ */
-
-#endif /* _ASM_X86_XCR_H */
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 0f849229c93b..c087f2d0f2d1 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -9,7 +9,6 @@
#include <asm/fpu/internal.h>
#include <asm/sigframe.h>
#include <asm/tlbflush.h>
-#include <asm/xcr.h>

static const char *xfeature_names[] =
{
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f93ae71416e4..2de55e953842 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -41,7 +41,6 @@
#include <asm/virtext.h>
#include <asm/mce.h>
#include <asm/fpu/internal.h>
-#include <asm/xcr.h>
#include <asm/perf_event.h>
#include <asm/debugreg.h>
#include <asm/kexec.h>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9ff4df77e069..5c61aae277f9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -61,7 +61,6 @@
#include <asm/mce.h>
#include <linux/kernel_stat.h>
#include <asm/fpu/internal.h> /* Ugh! */
-#include <asm/xcr.h>
#include <asm/pvclock.h>
#include <asm/div64.h>

--
2.1.0

2015-05-05 17:59:12

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 167/208] x86/fpu: Rename sanitize_i387_state() to fpstate_sanitize_xstate()

So the sanitize_i387_state() function has the following purpose:
on CPUs that support optimized xstate saving instructions, an
FPU fpstate might end up having partially uninitialized data.

This function initializes that data.

Note that the function name is a misnomer, and confusing on two levels:
not only is it not i387-specific at all, it is the exact opposite -
it only matters on xstate CPUs.

So rename sanitize_i387_state() and __sanitize_i387_state() to
fpstate_sanitize_xstate() and __fpstate_sanitize_xstate(),
to clearly express the purpose and usage of the function.

We'll further clean up this function in the next patch.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 6 +++---
arch/x86/kernel/fpu/core.c | 8 ++++----
arch/x86/kernel/fpu/xstate.c | 4 ++--
3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 161b51bf267e..6b6fa46037f8 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -139,13 +139,13 @@ static inline void fx_finit(struct i387_fxsave_struct *fx)
fx->mxcsr = MXCSR_DEFAULT;
}

-extern void __sanitize_i387_state(struct task_struct *);
+extern void __fpstate_sanitize_xstate(struct task_struct *);

-static inline void sanitize_i387_state(struct task_struct *tsk)
+static inline void fpstate_sanitize_xstate(struct task_struct *tsk)
{
if (!use_xsaveopt())
return;
- __sanitize_i387_state(tsk);
+ __fpstate_sanitize_xstate(tsk);
}

#define user_insn(insn, output, input...) \
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8ae4c2450c2b..9ccf2b838de0 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -395,7 +395,7 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
return -ENODEV;

fpu__activate_stopped(fpu);
- sanitize_i387_state(target);
+ fpstate_sanitize_xstate(target);

return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
&fpu->state.fxsave, 0, -1);
@@ -412,7 +412,7 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
return -ENODEV;

fpu__activate_stopped(fpu);
- sanitize_i387_state(target);
+ fpstate_sanitize_xstate(target);

ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&fpu->state.fxsave, 0, -1);
@@ -644,7 +644,7 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
&fpu->state.fsave, 0,
-1);

- sanitize_i387_state(target);
+ fpstate_sanitize_xstate(target);

if (kbuf && pos == 0 && count == sizeof(env)) {
convert_from_fxsr(kbuf, target);
@@ -666,7 +666,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,

fpu__activate_stopped(fpu);

- sanitize_i387_state(target);
+ fpstate_sanitize_xstate(target);

if (!static_cpu_has(X86_FEATURE_FPU))
return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c087f2d0f2d1..fc2ff1239fea 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -92,7 +92,7 @@ EXPORT_SYMBOL_GPL(cpu_has_xfeatures);
* if the corresponding header bit is zero. This is to ensure that user-space doesn't
* see some stale state in the memory layout during signal handling, debugging etc.
*/
-void __sanitize_i387_state(struct task_struct *tsk)
+void __fpstate_sanitize_xstate(struct task_struct *tsk)
{
struct i387_fxsave_struct *fx = &tsk->thread.fpu.state.fxsave;
int feature_bit;
@@ -318,7 +318,7 @@ int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
if (ia32_fxstate)
fpu_fxsave(&tsk->thread.fpu);
} else {
- sanitize_i387_state(tsk);
+ fpstate_sanitize_xstate(tsk);
if (__copy_to_user(buf_fx, xsave, xstate_size))
return -1;
}
--
2.1.0

2015-05-05 17:59:15

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 168/208] x86/fpu: Simplify fpstate_sanitize_xstate() calls

Remove the extra layer of __fpstate_sanitize_xstate():

if (!use_xsaveopt())
return;
__fpstate_sanitize_xstate(tsk);

and move the check for use_xsaveopt() into fpstate_sanitize_xstate().

In general we optimize for the presence of CPU features, not for
the absence of them. Furthermore there's little point in this inlining,
as the call sites are not super hot code paths.

Doing this uninlining shrinks the code a bit:

text data bss dec hex filename
14108751 2573624 1634304 18316679 1177d87 vmlinux.before
14108627 2573624 1634304 18316555 1177d0b vmlinux.after

Also remove a pointless '!fx' check from fpstate_sanitize_xstate().
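The structural change can be sketched in plain C (illustrative names only, not the kernel code): the feature check moves from an inline wrapper duplicated at every call site into the single out-of-line function itself, which early-returns on CPUs without the feature.

```c
#include <assert.h>

static int use_xsaveopt_sim;	/* stand-in for the CPU-feature check */
static int sanitize_calls;	/* counts how often the real work runs */

/* After the patch: one out-of-line function, feature check inside it. */
static void fpstate_sanitize_xstate_sim(void)
{
	if (!use_xsaveopt_sim)
		return;

	sanitize_calls++;	/* stands in for the actual sanitizing work */
}
```

Each call site shrinks from an inlined test-and-branch to a plain call, which is where the ~124 bytes of text savings quoted above come from.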

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 9 +--------
arch/x86/kernel/fpu/xstate.c | 4 ++--
2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 6b6fa46037f8..88fec3f108de 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -139,14 +139,7 @@ static inline void fx_finit(struct i387_fxsave_struct *fx)
fx->mxcsr = MXCSR_DEFAULT;
}

-extern void __fpstate_sanitize_xstate(struct task_struct *);
-
-static inline void fpstate_sanitize_xstate(struct task_struct *tsk)
-{
- if (!use_xsaveopt())
- return;
- __fpstate_sanitize_xstate(tsk);
-}
+extern void fpstate_sanitize_xstate(struct task_struct *);

#define user_insn(insn, output, input...) \
({ \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index fc2ff1239fea..47b9591947e1 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -92,13 +92,13 @@ EXPORT_SYMBOL_GPL(cpu_has_xfeatures);
* if the corresponding header bit is zero. This is to ensure that user-space doesn't
* see some stale state in the memory layout during signal handling, debugging etc.
*/
-void __fpstate_sanitize_xstate(struct task_struct *tsk)
+void fpstate_sanitize_xstate(struct task_struct *tsk)
{
struct i387_fxsave_struct *fx = &tsk->thread.fpu.state.fxsave;
int feature_bit;
u64 xfeatures;

- if (!fx)
+ if (!use_xsaveopt())
return;

xfeatures = tsk->thread.fpu.state.xsave.header.xfeatures;
--
2.1.0

2015-05-05 18:14:35

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 169/208] x86/fpu: Pass 'struct fpu' to fpstate_sanitize_xstate()

Currently fpstate_sanitize_xstate() has a task_struct input parameter,
but it only uses the fpu structure from it - so pass in a 'struct fpu'
pointer only and update all call sites.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 2 +-
arch/x86/kernel/fpu/core.c | 9 ++++-----
arch/x86/kernel/fpu/xstate.c | 8 ++++----
3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 88fec3f108de..da96b0cbfcb3 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -139,7 +139,7 @@ static inline void fx_finit(struct i387_fxsave_struct *fx)
fx->mxcsr = MXCSR_DEFAULT;
}

-extern void fpstate_sanitize_xstate(struct task_struct *);
+extern void fpstate_sanitize_xstate(struct fpu *fpu);

#define user_insn(insn, output, input...) \
({ \
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 9ccf2b838de0..7e91a6f7564a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -395,7 +395,7 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
return -ENODEV;

fpu__activate_stopped(fpu);
- fpstate_sanitize_xstate(target);
+ fpstate_sanitize_xstate(fpu);

return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
&fpu->state.fxsave, 0, -1);
@@ -412,7 +412,7 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
return -ENODEV;

fpu__activate_stopped(fpu);
- fpstate_sanitize_xstate(target);
+ fpstate_sanitize_xstate(fpu);

ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&fpu->state.fxsave, 0, -1);
@@ -644,7 +644,7 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
&fpu->state.fsave, 0,
-1);

- fpstate_sanitize_xstate(target);
+ fpstate_sanitize_xstate(fpu);

if (kbuf && pos == 0 && count == sizeof(env)) {
convert_from_fxsr(kbuf, target);
@@ -665,8 +665,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
int ret;

fpu__activate_stopped(fpu);
-
- fpstate_sanitize_xstate(target);
+ fpstate_sanitize_xstate(fpu);

if (!static_cpu_has(X86_FEATURE_FPU))
return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 47b9591947e1..a8ce38a9d70b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -92,16 +92,16 @@ EXPORT_SYMBOL_GPL(cpu_has_xfeatures);
* if the corresponding header bit is zero. This is to ensure that user-space doesn't
* see some stale state in the memory layout during signal handling, debugging etc.
*/
-void fpstate_sanitize_xstate(struct task_struct *tsk)
+void fpstate_sanitize_xstate(struct fpu *fpu)
{
- struct i387_fxsave_struct *fx = &tsk->thread.fpu.state.fxsave;
+ struct i387_fxsave_struct *fx = &fpu->state.fxsave;
int feature_bit;
u64 xfeatures;

if (!use_xsaveopt())
return;

- xfeatures = tsk->thread.fpu.state.xsave.header.xfeatures;
+ xfeatures = fpu->state.xsave.header.xfeatures;

/*
* None of the feature bits are in init state. So nothing else
@@ -318,7 +318,7 @@ int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
if (ia32_fxstate)
fpu_fxsave(&tsk->thread.fpu);
} else {
- fpstate_sanitize_xstate(tsk);
+ fpstate_sanitize_xstate(&tsk->thread.fpu);
if (__copy_to_user(buf_fx, xsave, xstate_size))
return -1;
}
--
2.1.0

2015-05-05 17:59:19

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 170/208] x86/fpu: Rename save_xstate_sig() to copy_fpstate_to_sigframe()

Standardize the naming of save_xstate_sig() by renaming it to
copy_fpstate_to_sigframe(): this tells us at a glance that
the function copies an FPU fpstate to a signal frame.

This naming also follows the naming of copy_fpregs_to_fpstate().

Don't put 'xstate' into the name: since this is a generic name,
it's expected that the function is able to handle xstate frames
as well, beyond legacy frames.

xstate used to be the odd case in the x86 FPU code - now it's the
common case.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/ia32/ia32_signal.c | 2 +-
arch/x86/include/asm/fpu/internal.h | 2 +-
arch/x86/kernel/fpu/xstate.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index d6d8f4ca5136..2e0b1b7842ae 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -327,7 +327,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,

sp = alloc_mathframe(sp, 1, &fx_aligned, &math_size);
*fpstate = (struct _fpstate_ia32 __user *) sp;
- if (save_xstate_sig(*fpstate, (void __user *)fx_aligned,
+ if (copy_fpstate_to_sigframe(*fpstate, (void __user *)fx_aligned,
math_size) < 0)
return (void __user *) -1L;
}
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index da96b0cbfcb3..58c274dfcb62 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -523,7 +523,7 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switc
/*
* Signal frame handlers...
*/
-extern int save_xstate_sig(void __user *buf, void __user *fx, int size);
+extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fx, int size);
extern int __restore_xstate_sig(void __user *buf, void __user *fx, int size);

static inline int xstate_sigframe_size(void)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index a8ce38a9d70b..5178158a03ff 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -293,7 +293,7 @@ static inline int save_user_xstate(struct xsave_struct __user *buf)
* For [f]xsave state, update the SW reserved fields in the [f]xsave frame
* indicating the absence/presence of the extended state to the user.
*/
-int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
+int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
{
struct xsave_struct *xsave = &current->thread.fpu.state.xsave;
struct task_struct *tsk = current;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index c67f96c87938..59cfc9c97491 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -235,7 +235,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,

/* save i387 and extended state */
if (fpu->fpstate_active &&
- save_xstate_sig(*fpstate, (void __user *)buf_fx, math_size) < 0)
+ copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size) < 0)
return (void __user *)-1L;

return (void __user *)sp;
--
2.1.0

2015-05-05 18:14:11

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 171/208] x86/fpu: Rename save_user_xstate() to copy_fpregs_to_sigframe()

Bring the naming in line with the existing names, so that we now have:

copy_fpregs_to_fpstate()
copy_fpstate_to_sigframe()
copy_fpregs_to_sigframe()

... where each function does what its name suggests.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/fpu/xstate.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 5178158a03ff..28638820ed0e 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -257,7 +257,7 @@ static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
return err;
}

-static inline int save_user_xstate(struct xsave_struct __user *buf)
+static inline int copy_fpregs_to_sigframe(struct xsave_struct __user *buf)
{
int err;

@@ -312,7 +312,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)

if (user_has_fpu()) {
/* Save the live register state to the user directly. */
- if (save_user_xstate(buf_fx))
+ if (copy_fpregs_to_sigframe(buf_fx))
return -1;
/* Update the thread's fxstate to save the fsave header. */
if (ia32_fxstate)
--
2.1.0

2015-05-05 18:13:49

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 172/208] x86/fpu: Clarify ancient comments in fpu__restore()

So this function still had ancient language about 'saving current
math information' - but we haven't been doing lazy FPU saves for
quite some time; we are doing lazy FPU restores.

Also remove IRQ13 related comment, which we don't support anymore
either.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/fpu/core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7e91a6f7564a..45ef4e51928b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -319,14 +319,14 @@ static void fpu__activate_stopped(struct fpu *child_fpu)
}

/*
- * 'fpu__restore()' saves the current math information in the
- * old math state array, and gets the new ones from the current task
+ * 'fpu__restore()' is called to copy FPU registers from
+ * the FPU fpstate to the live hw registers and to activate
+ * access to the hardware registers, so that FPU instructions
+ * can be used afterwards.
*
- * Careful.. There are problems with IBM-designed IRQ13 behaviour.
- * Don't touch unless you *really* know how it works.
- *
- * Must be called with kernel preemption disabled (eg with local
- * local interrupts as in the case of do_device_not_available).
+ * Must be called with kernel preemption disabled (for example
+ * with local interrupts disabled, as it is in the case of
+ * do_device_not_available()).
*/
void fpu__restore(void)
{
--
2.1.0

2015-05-05 18:13:21

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 173/208] x86/fpu: Rename user_has_fpu() to fpregs_active()

Rename this function in line with the new FPU nomenclature.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 4 ++--
arch/x86/kernel/fpu/xstate.c | 2 +-
arch/x86/kvm/vmx.c | 2 +-
drivers/lguest/x86/core.c | 6 +++---
4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 58c274dfcb62..0f17cd4e4e58 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -358,7 +358,7 @@ static inline void __fpregs_activate(struct fpu *fpu)
* to save the FP state - we'll just take a #NM
* fault and get the FPU access back.
*/
-static inline int user_has_fpu(void)
+static inline int fpregs_active(void)
{
return current->thread.fpu.fpregs_active;
}
@@ -557,7 +557,7 @@ static inline void user_fpu_begin(void)
struct fpu *fpu = &current->thread.fpu;

preempt_disable();
- if (!user_has_fpu())
+ if (!fpregs_active())
fpregs_activate(fpu);
preempt_enable();
}
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 28638820ed0e..b8e5fee2aef3 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -310,7 +310,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
sizeof(struct user_i387_ia32_struct), NULL,
(struct _fpstate_ia32 __user *) buf) ? -1 : 1;

- if (user_has_fpu()) {
+ if (fpregs_active()) {
/* Save the live register state to the user directly. */
if (copy_fpregs_to_sigframe(buf_fx))
return -1;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2de55e953842..1c384bf856e5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1882,7 +1882,7 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
* If the FPU is not active (through the host task or
* the guest vcpu), then restore the cr0.TS bit.
*/
- if (!user_has_fpu() && !vmx->vcpu.guest_fpu_loaded)
+ if (!fpregs_active() && !vmx->vcpu.guest_fpu_loaded)
stts();
load_gdt(this_cpu_ptr(&host_gdt));
}
diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index b80e4b8c9b6e..99bb3009e2d5 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -251,7 +251,7 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
* we set it now, so we can trap and pass that trap to the Guest if it
* uses the FPU.
*/
- if (cpu->ts && user_has_fpu())
+ if (cpu->ts && fpregs_active())
stts();

/*
@@ -283,7 +283,7 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
wrmsr(MSR_IA32_SYSENTER_CS, __KERNEL_CS, 0);

/* Clear the host TS bit if it was set above. */
- if (cpu->ts && user_has_fpu())
+ if (cpu->ts && fpregs_active())
clts();

/*
@@ -301,7 +301,7 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
* a different CPU. So all the critical stuff should be done
* before this.
*/
- else if (cpu->regs->trapnum == 7 && !user_has_fpu())
+ else if (cpu->regs->trapnum == 7 && !fpregs_active())
fpu__restore();
}

--
2.1.0

2015-05-05 18:13:02

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 174/208] x86/fpu: Initialize fpregs in fpu__init_cpu_generic()

On non-xsave-capable CPUs, the FPU fpregs do not get initialized
on secondary CPUs during bootup.

For example on one of my systems, the secondary CPU has this FPU
state on bootup:

x86: Booting SMP configuration:
.... node #0, CPUs: #1
x86/fpu ######################
x86/fpu # FPU register dump on CPU#1:
x86/fpu # ... CWD: ffff0040
x86/fpu # ... SWD: ffff0000
x86/fpu # ... TWD: ffff555a
x86/fpu # ... FIP: 00000000
x86/fpu # ... FCS: 00000000
x86/fpu # ... FOO: 00000000
x86/fpu # ... FOS: ffff0000
x86/fpu # ... FP0: 02 57 00 00 00 00 00 00 ff ff
x86/fpu # ... FP1: 1b e2 00 00 00 00 00 00 ff ff
x86/fpu # ... FP2: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP3: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP4: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP5: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP6: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... FP7: 00 00 00 00 00 00 00 00 00 00
x86/fpu # ... SW: dadadada
x86/fpu ######################

Note how CWD and TWD are off their usual init state (0x037f and 0xffff),
and how FP0 and FP1 have non-zero content.

This is normally not a problem, because any user-space FPU state
is initialized properly - but it can complicate the use of FPU
instructions in kernel code via kernel_fpu_begin()/end(): if
the FPU-using code does not initialize the registers itself, it
might generate spurious exceptions depending on which CPU it
executes on.

Fix this by initializing the x87 state via the FNINIT instruction.
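For reference, the expected-vs-dumped x87 state can be expressed as a small, testable helper (a plain-C model; the init constants are the documented FNINIT values, matching the 0x037f/0xffff figures quoted above):

```c
#include <assert.h>
#include <stdint.h>

#define X87_CWD_INIT	0x037f	/* control word after FNINIT */
#define X87_TWD_INIT	0xffff	/* tag word after FNINIT: all registers empty */

/* Returns nonzero if the low 16 bits of CWD/TWD match the FNINIT state: */
static int x87_state_is_init(uint16_t cwd, uint16_t twd)
{
	return cwd == X87_CWD_INIT && twd == X87_TWD_INIT;
}
```

The dump above fails this check: its CWD is 0xffff0040 and TWD is 0xffff555a, i.e. 0x0040/0x555a in the low 16 bits, which is why the FNINIT in the patch is needed.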

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/fpu/init.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 460e7e2c6186..72219ce2385a 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -36,6 +36,9 @@ static void fpu__init_cpu_generic(void)
if (!cpu_has_fpu)
cr0 |= X86_CR0_EM;
write_cr0(cr0);
+
+ /* Flush out any pending x87 state: */
+ asm volatile ("fninit");
}

/*
--
2.1.0

2015-05-05 18:12:44

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 175/208] x86/fpu: Clean up fpu__clear() state handling

We currently leak FPU state across execve() boundaries on eagerfpu systems:

$ /host/home/mingo/dump-xmm-regs-exec
# XMM state before execve():
XMM0 : 000000000000dede
XMM1 : 000000000000dedf
XMM2 : 000000000000dee0
XMM3 : 000000000000dee1
XMM4 : 000000000000dee2
XMM5 : 000000000000dee3
XMM6 : 000000000000dee4
XMM7 : 000000000000dee5
XMM8 : 000000000000dee6
XMM9 : 000000000000dee7
XMM10: 000000000000dee8
XMM11: 000000000000dee9
XMM12: 000000000000deea
XMM13: 000000000000deeb
XMM14: 000000000000deec
XMM15: 000000000000deed

# XMM state after execve(), in the new task context:
XMM0 : 0000000000000000
XMM1 : 2f2f2f2f2f2f2f2f
XMM2 : 0000000000000000
XMM3 : 0000000000000000
XMM4 : 00000000000000ff
XMM5 : 00000000ff000000
XMM6 : 000000000000dee4
XMM7 : 000000000000dee5
XMM8 : 0000000000000000
XMM9 : 0000000000000000
XMM10: 0000000000000000
XMM11: 0000000000000000
XMM12: 0000000000000000
XMM13: 000000000000deeb
XMM14: 000000000000deec
XMM15: 000000000000deed

The reason is that fpu__clear() does not clear out the state properly.

Fix it.
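The control-flow change can be modeled in plain C (a sketch, not the kernel code): before the fix, restore_init_xstate() was only reached when the task had no active fpstate yet, so an exec()ing task with live FPU state kept its old registers; after the fix it runs unconditionally on the eagerfpu path.

```c
#include <assert.h>

/*
 * Simplified model of the eagerfpu branch of fpu__clear().
 * Returns 1 if the init xstate gets restored for this task.
 */
static int fpu_clear_restores_init(int fpstate_active, int fixed)
{
	int restored = 0;

	if (fixed) {
		/* After the patch: always restore the init state. */
		restored = 1;
	} else {
		/* Before the patch: only freshly-activated tasks were reset. */
		if (!fpstate_active)
			restored = 1;
	}
	return restored;
}
```

The leak shown in the XMM dump corresponds to the (fpstate_active, !fixed) case, where the old registers survive into the new binary.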

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/fpu/core.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 45ef4e51928b..33c9a43b000e 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -348,6 +348,10 @@ void fpu__restore(void)
}
EXPORT_SYMBOL_GPL(fpu__restore);

+/*
+ * Called by sys_execve() to clear the FPU fpregs, so that FPU state
+ * of the previous binary does not leak over into the exec()ed binary:
+ */
void fpu__clear(struct task_struct *tsk)
{
struct fpu *fpu = &tsk->thread.fpu;
@@ -361,8 +365,8 @@ void fpu__clear(struct task_struct *tsk)
if (!fpu->fpstate_active) {
fpu__activate_curr(fpu);
user_fpu_begin();
- restore_init_xstate();
}
+ restore_init_xstate();
}
}

--
2.1.0

2015-05-05 17:59:24

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 176/208] x86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_state()

We'd like to use xsave_state() earlier, but its SYSTEM_BOOTING check
is too imprecise.

The real condition that xsave_state() would like to check is whether
alternative XSAVE instructions were patched into the kernel image
already.

Add such a (read-mostly) debug flag and use it in xsave_state().
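The intended ordering of the flag can be sketched in plain C (an illustrative model, not the kernel code): the flag starts at zero, is set once at the very end of alternative patching, and any xsave_state() call before that point trips the check.

```c
#include <assert.h>

static int alternatives_patched_sim;

/* Models WARN_ON(!alternatives_patched) at the top of xsave_state(): */
static int xsave_state_would_warn(void)
{
	return !alternatives_patched_sim;
}

/* Models alternative_instructions(): the flag is set last, as in the patch. */
static void alternative_instructions_sim(void)
{
	/* ... apply_alternatives(), apply_paravirt(), etc. ... */
	alternatives_patched_sim = 1;
}
```

Unlike the SYSTEM_BOOTING test it replaces, this tracks the exact event that matters: whether the XSAVE alternatives have been patched in yet.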

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/alternative.h | 6 ++++++
arch/x86/include/asm/fpu/xstate.h | 2 +-
arch/x86/kernel/alternative.c | 5 +++++
3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index ba32af062f61..7bfc85bbb8ff 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -52,6 +52,12 @@ struct alt_instr {
u8 padlen; /* length of build-time padding */
} __packed;

+/*
+ * Debug flag that can be tested to see whether alternative
+ * instructions were patched in already:
+ */
+extern int alternatives_patched;
+
extern void alternative_instructions(void);
extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 31a002ad5aeb..ab2c507b58b6 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -119,7 +119,7 @@ static inline int xsave_state(struct xsave_struct *fx)
u32 hmask = mask >> 32;
int err = 0;

- WARN_ON(system_state == SYSTEM_BOOTING);
+ WARN_ON(!alternatives_patched);

/*
* If xsaves is enabled, xsaves replaces xsaveopt because
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index aef653193160..7fe097235376 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -21,6 +21,10 @@
#include <asm/io.h>
#include <asm/fixmap.h>

+int __read_mostly alternatives_patched;
+
+EXPORT_SYMBOL_GPL(alternatives_patched);
+
#define MAX_PATCH_LEN (255-1)

static int __initdata_or_module debug_alternative;
@@ -627,6 +631,7 @@ void __init alternative_instructions(void)
apply_paravirt(__parainstructions, __parainstructions_end);

restart_nmi();
+ alternatives_patched = 1;
}

/**
--
2.1.0

2015-05-05 18:12:25

by Ingo Molnar

Subject: [PATCH 177/208] x86/fpu: Synchronize the naming of drop_fpu() and fpu_reset_state()

drop_fpu() and fpu_reset_state() are similar in functionality
and in scope, yet this is not apparent from their names.

drop_fpu() deactivates FPU contents (both the fpregs and the fpstate),
but leaves register contents intact in the eager-FPU case, mostly as an
optimization. It disables fpregs in the lazy FPU case. The drop_fpu()
method can be used to destroy FPU state in an optimized way, when we
know that a new state will be loaded before user-space might see
any remains of the old FPU state:

- such as in sys_exit()'s exit_thread() where we know this task
won't execute any user-space instructions anymore and the
next context switch cleans up the FPU. The old FPU state
might still be around in the eagerfpu case but won't be
saved.

- in __restore_xstate_sig(), where we use drop_fpu() before
copying a new state into the fpstate and activating that one.
No user-space instructions can execute between those steps.

- in sys_execve()'s fpu__clear(): there we use drop_fpu() in
the !eagerfpu case, where it's equivalent to a full reinit.

fpu_reset_state() is a stronger version of drop_fpu(): both in
the eagerfpu and the lazy-FPU case it guarantees that fpregs
are reinitialized to init state. This method is used in cases
where we need a full reset:

- handle_signal() uses fpu_reset_state() to reset the FPU state
to init before executing a user-space signal handler. While we
have already saved the original FPU state at this point, and
always restore the original state, the signal handling code
still has to do this reinit, because signals may interrupt
any user-space instruction, and the FPU might be in various
intermediate states (such as an unbalanced x87 stack) that are
not immediately usable by general C signal handler code.

- __restore_xstate_sig() uses fpu_reset_state() when the signal
frame has no FP context. Since the signal handler may have
modified the FPU state, it gets reset back to init state.

- in another branch __restore_xstate_sig() uses fpu_reset_state()
to handle a restoration error: when restore_user_xstate() fails
to restore FPU state and we might have inconsistent FPU data,
fpu_reset_state() is used to reset it back to a known good
state.

- __kernel_fpu_end() uses fpu_reset_state() in an error branch.
This is in a 'must not trigger' error branch, so on bug-free
kernels this never triggers.

- fpu__restore() uses fpu_reset_state() in an error path
as well: if the fpstate was set up with invalid FPU state
(via ptrace or via a signal handler), then it's reset back
to init state.

- likewise, the scheduler's switch_fpu_finish() uses it in a
restoration error path too.

Move both drop_fpu() and fpu_reset_state() to the fpu__*() namespace
and harmonize their naming with their function:

fpu__drop()
fpu__reset()

This clearly shows that both methods operate on the full state of the
FPU, just like fpu__restore().

Also add comments to explain what each function does.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 23 ++++++++++++++---------
arch/x86/kernel/fpu/core.c | 6 +++---
arch/x86/kernel/fpu/xstate.c | 6 +++---
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
5 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 0f17cd4e4e58..31bfda818f30 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -382,11 +382,17 @@ static inline void fpregs_deactivate(struct fpu *fpu)
__fpregs_deactivate_hw();
}

-static inline void drop_fpu(struct fpu *fpu)
+/*
+ * Drops current FPU state: deactivates the fpregs and
+ * the fpstate. NOTE: it still leaves previous contents
+ * in the fpregs in the eager-FPU case.
+ *
+ * This function can be used in cases where we know that
+ * a state-restore is coming: either an explicit one,
+ * or a reschedule.
+ */
+static inline void fpu__drop(struct fpu *fpu)
{
- /*
- * Forget coprocessor state..
- */
preempt_disable();
fpu->counter = 0;

@@ -412,13 +418,12 @@ static inline void restore_init_xstate(void)
}

/*
- * Reset the FPU state in the eager case and drop it in the lazy case (later use
- * will reinit it).
+ * Reset the FPU state back to init state.
*/
-static inline void fpu_reset_state(struct fpu *fpu)
+static inline void fpu__reset(struct fpu *fpu)
{
if (!use_eager_fpu())
- drop_fpu(fpu);
+ fpu__drop(fpu);
else
restore_init_xstate();
}
@@ -516,7 +521,7 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switc
{
if (fpu_switch.preload) {
if (unlikely(restore_fpu_checking(new_fpu)))
- fpu_reset_state(new_fpu);
+ fpu__reset(new_fpu);
}
}

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 33c9a43b000e..11ec1b736172 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -116,7 +116,7 @@ void __kernel_fpu_end(void)

if (fpu->fpregs_active) {
if (WARN_ON(restore_fpu_checking(fpu)))
- fpu_reset_state(fpu);
+ fpu__reset(fpu);
} else {
__fpregs_deactivate_hw();
}
@@ -339,7 +339,7 @@ void fpu__restore(void)
kernel_fpu_disable();
fpregs_activate(fpu);
if (unlikely(restore_fpu_checking(fpu))) {
- fpu_reset_state(fpu);
+ fpu__reset(fpu);
force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
} else {
tsk->thread.fpu.counter++;
@@ -360,7 +360,7 @@ void fpu__clear(struct task_struct *tsk)

if (!use_eager_fpu()) {
/* FPU state will be reallocated lazily at the first use. */
- drop_fpu(fpu);
+ fpu__drop(fpu);
} else {
if (!fpu->fpstate_active) {
fpu__activate_curr(fpu);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index b8e5fee2aef3..5e3d9242bb95 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -401,7 +401,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
config_enabled(CONFIG_IA32_EMULATION));

if (!buf) {
- fpu_reset_state(fpu);
+ fpu__reset(fpu);
return 0;
}

@@ -449,7 +449,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
* We will be ready to restore/save the state only after
* fpu->fpstate_active is again set.
*/
- drop_fpu(fpu);
+ fpu__drop(fpu);

if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) ||
__copy_from_user(&env, buf, sizeof(env))) {
@@ -474,7 +474,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
*/
user_fpu_begin();
if (restore_user_xstate(buf_fx, xfeatures, fx_only)) {
- fpu_reset_state(fpu);
+ fpu__reset(fpu);
return -1;
}
}
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 4b4b16c8e6ee..099e7a889ab9 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -110,7 +110,7 @@ void exit_thread(void)
kfree(bp);
}

- drop_fpu(fpu);
+ fpu__drop(fpu);
}

void flush_thread(void)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 59cfc9c97491..6bf512390536 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -667,7 +667,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
* Ensure the signal handler starts with the new fpu state.
*/
if (fpu->fpstate_active)
- fpu_reset_state(fpu);
+ fpu__reset(fpu);
}
signal_setup_done(failed, ksig, stepping);
}
--
2.1.0

2015-05-05 18:09:40

by Ingo Molnar

Subject: [PATCH 178/208] x86/fpu: Rename restore_fpu_checking() to copy_fpstate_to_fpregs()

fpu_restore_checking() is a helper function of restore_fpu_checking(),
but this is not apparent from the naming.

Both copy fpstate contents to the fpregs, while the fuller variant does
a full copy that also avoids leaking stale register contents.

So rename them to:

copy_fpstate_to_fpregs()
__copy_fpstate_to_fpregs()

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 8 ++++----
arch/x86/kernel/fpu/core.c | 4 ++--
arch/x86/kvm/x86.c | 2 +-
3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 31bfda818f30..c09aea145e09 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -289,7 +289,7 @@ static inline int copy_fpregs_to_fpstate(struct fpu *fpu)

extern void fpu__save(struct fpu *fpu);

-static inline int fpu_restore_checking(struct fpu *fpu)
+static inline int __copy_fpstate_to_fpregs(struct fpu *fpu)
{
if (use_xsave())
return fpu_xrstor_checking(&fpu->state.xsave);
@@ -299,7 +299,7 @@ static inline int fpu_restore_checking(struct fpu *fpu)
return frstor_checking(&fpu->state.fsave);
}

-static inline int restore_fpu_checking(struct fpu *fpu)
+static inline int copy_fpstate_to_fpregs(struct fpu *fpu)
{
/*
* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
@@ -314,7 +314,7 @@ static inline int restore_fpu_checking(struct fpu *fpu)
: : [addr] "m" (fpu->fpregs_active));
}

- return fpu_restore_checking(fpu);
+ return __copy_fpstate_to_fpregs(fpu);
}

/*
@@ -520,7 +520,7 @@ switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switch)
{
if (fpu_switch.preload) {
- if (unlikely(restore_fpu_checking(new_fpu)))
+ if (unlikely(copy_fpstate_to_fpregs(new_fpu)))
fpu__reset(new_fpu);
}
}
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 11ec1b736172..72e3f02db40d 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -115,7 +115,7 @@ void __kernel_fpu_end(void)
struct fpu *fpu = &current->thread.fpu;

if (fpu->fpregs_active) {
- if (WARN_ON(restore_fpu_checking(fpu)))
+ if (WARN_ON(copy_fpstate_to_fpregs(fpu)))
fpu__reset(fpu);
} else {
__fpregs_deactivate_hw();
@@ -338,7 +338,7 @@ void fpu__restore(void)
/* Avoid __kernel_fpu_begin() right after fpregs_activate() */
kernel_fpu_disable();
fpregs_activate(fpu);
- if (unlikely(restore_fpu_checking(fpu))) {
+ if (unlikely(copy_fpstate_to_fpregs(fpu))) {
fpu__reset(fpu);
force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
} else {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5c61aae277f9..f4438179398b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7030,7 +7030,7 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
kvm_put_guest_xcr0(vcpu);
vcpu->guest_fpu_loaded = 1;
__kernel_fpu_begin();
- fpu_restore_checking(&vcpu->arch.guest_fpu);
+ __copy_fpstate_to_fpregs(&vcpu->arch.guest_fpu);
trace_kvm_fpu(1);
}

--
2.1.0

2015-05-05 17:59:29

by Ingo Molnar

Subject: [PATCH 179/208] x86/fpu: Move all the fpu__*() high level methods closer to each other

The fpu__*() methods are closely related, but they are defined
in scattered places within the FPU code.

Concentrate them, and also uninline fpu__save(), fpu__drop()
and fpu__reset() to save about 5K of kernel text on 64-bit kernels:

text data bss dec filename
14113063 2575280 1634304 18322647 vmlinux.before
14108070 2575280 1634304 18317654 vmlinux.after

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 53 ++++++++++-------------------------------------------
arch/x86/kernel/fpu/core.c | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 48 insertions(+), 43 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index c09aea145e09..f20a0030f6a1 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -46,10 +46,19 @@ extern void fpu__init_system(struct cpuinfo_x86 *c);

extern void fpu__activate_curr(struct fpu *fpu);
extern void fpstate_init(struct fpu *fpu);
-extern void fpu__clear(struct task_struct *tsk);

extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
+
+/*
+ * High level FPU state handling functions:
+ */
+extern void fpu__save(struct fpu *fpu);
extern void fpu__restore(void);
+extern void fpu__drop(struct fpu *fpu);
+extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
+extern void fpu__reset(struct fpu *fpu);
+extern void fpu__clear(struct task_struct *tsk);
+
extern void fpu__init_check_bugs(void);
extern void fpu__resume_cpu(void);

@@ -287,8 +296,6 @@ static inline int copy_fpregs_to_fpstate(struct fpu *fpu)
return 0;
}

-extern void fpu__save(struct fpu *fpu);
-
static inline int __copy_fpstate_to_fpregs(struct fpu *fpu)
{
if (use_xsave())
@@ -382,33 +389,6 @@ static inline void fpregs_deactivate(struct fpu *fpu)
__fpregs_deactivate_hw();
}

-/*
- * Drops current FPU state: deactivates the fpregs and
- * the fpstate. NOTE: it still leaves previous contents
- * in the fpregs in the eager-FPU case.
- *
- * This function can be used in cases where we know that
- * a state-restore is coming: either an explicit one,
- * or a reschedule.
- */
-static inline void fpu__drop(struct fpu *fpu)
-{
- preempt_disable();
- fpu->counter = 0;
-
- if (fpu->fpregs_active) {
- /* Ignore delayed exceptions from user space */
- asm volatile("1: fwait\n"
- "2:\n"
- _ASM_EXTABLE(1b, 2b));
- fpregs_deactivate(fpu);
- }
-
- fpu->fpstate_active = 0;
-
- preempt_enable();
-}
-
static inline void restore_init_xstate(void)
{
if (use_xsave())
@@ -418,17 +398,6 @@ static inline void restore_init_xstate(void)
}

/*
- * Reset the FPU state back to init state.
- */
-static inline void fpu__reset(struct fpu *fpu)
-{
- if (!use_eager_fpu())
- fpu__drop(fpu);
- else
- restore_init_xstate();
-}
-
-/*
* Definitions for the eXtended Control Register instructions
*/

@@ -597,8 +566,6 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
}
}

-extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
-
static inline unsigned long
alloc_mathframe(unsigned long sp, int ia32_frame, unsigned long *buf_fx,
unsigned long *size)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 72e3f02db40d..cba02f7e337b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -349,6 +349,44 @@ void fpu__restore(void)
EXPORT_SYMBOL_GPL(fpu__restore);

/*
+ * Drops current FPU state: deactivates the fpregs and
+ * the fpstate. NOTE: it still leaves previous contents
+ * in the fpregs in the eager-FPU case.
+ *
+ * This function can be used in cases where we know that
+ * a state-restore is coming: either an explicit one,
+ * or a reschedule.
+ */
+void fpu__drop(struct fpu *fpu)
+{
+ preempt_disable();
+ fpu->counter = 0;
+
+ if (fpu->fpregs_active) {
+ /* Ignore delayed exceptions from user space */
+ asm volatile("1: fwait\n"
+ "2:\n"
+ _ASM_EXTABLE(1b, 2b));
+ fpregs_deactivate(fpu);
+ }
+
+ fpu->fpstate_active = 0;
+
+ preempt_enable();
+}
+
+/*
+ * Reset the FPU state back to init state:
+ */
+void fpu__reset(struct fpu *fpu)
+{
+ if (!use_eager_fpu())
+ fpu__drop(fpu);
+ else
+ restore_init_xstate();
+}
+
+/*
* Called by sys_execve() to clear the FPU fpregs, so that FPU state
* of the previous binary does not leak over into the exec()ed binary:
*/
--
2.1.0

2015-05-05 18:09:11

by Ingo Molnar

Subject: [PATCH 180/208] x86/fpu: Move fpu__clear() to 'struct fpu *' parameter passing

Do it like all other high level FPU state handling functions: they
only know about struct fpu, not about the task.

(Also remove a dead prototype while at it.)

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 5 ++---
arch/x86/kernel/fpu/core.c | 6 ++----
arch/x86/kernel/process.c | 2 +-
3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index f20a0030f6a1..d6ac4611f05e 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -37,9 +37,8 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
#define MXCSR_DEFAULT 0x1f80

extern unsigned int mxcsr_feature_mask;
-extern void fpu__init_cpu(void);
-extern void eager_fpu_init(void);

+extern void fpu__init_cpu(void);
extern void fpu__init_system_xstate(void);
extern void fpu__init_cpu_xstate(void);
extern void fpu__init_system(struct cpuinfo_x86 *c);
@@ -57,7 +56,7 @@ extern void fpu__restore(void);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
extern void fpu__reset(struct fpu *fpu);
-extern void fpu__clear(struct task_struct *tsk);
+extern void fpu__clear(struct fpu *fpu);

extern void fpu__init_check_bugs(void);
extern void fpu__resume_cpu(void);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index cba02f7e337b..51afe4466ae3 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -390,11 +390,9 @@ void fpu__reset(struct fpu *fpu)
* Called by sys_execve() to clear the FPU fpregs, so that FPU state
* of the previous binary does not leak over into the exec()ed binary:
*/
-void fpu__clear(struct task_struct *tsk)
+void fpu__clear(struct fpu *fpu)
{
- struct fpu *fpu = &tsk->thread.fpu;
-
- WARN_ON_ONCE(tsk != current); /* Almost certainly an anomaly */
+ WARN_ON_ONCE(fpu != &current->thread.fpu); /* Almost certainly an anomaly */

if (!use_eager_fpu()) {
/* FPU state will be reallocated lazily at the first use. */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 099e7a889ab9..a478e8ba16f7 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -120,7 +120,7 @@ void flush_thread(void)
flush_ptrace_hw_breakpoint(tsk);
memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));

- fpu__clear(tsk);
+ fpu__clear(&tsk->thread.fpu);
}

static void hard_disable_TSC(void)
--
2.1.0

2015-05-05 18:08:49

by Ingo Molnar

Subject: [PATCH 181/208] x86/fpu: Rename restore_xstate_sig() to fpu__restore_sig()

restore_xstate_sig() is a misnomer: it's not limited to 'xstate' at all,
it is the high level 'restore FPU state from a signal frame' function
that works with all legacy FPU formats as well.

Rename it (and its helper) accordingly, and also move it to the
fpu__*() namespace.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/ia32/ia32_signal.c | 2 +-
arch/x86/include/asm/fpu/internal.h | 6 +++---
arch/x86/kernel/fpu/xstate.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 2e0b1b7842ae..de81ddf3b0b2 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -197,7 +197,7 @@ static int ia32_restore_sigcontext(struct pt_regs *regs,
buf = compat_ptr(tmp);
} get_user_catch(err);

- err |= restore_xstate_sig(buf, 1);
+ err |= fpu__restore_sig(buf, 1);

force_iret();

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index d6ac4611f05e..d6cfbdafbab2 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -497,14 +497,14 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switc
* Signal frame handlers...
*/
extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fx, int size);
-extern int __restore_xstate_sig(void __user *buf, void __user *fx, int size);
+extern int __fpu__restore_sig(void __user *buf, void __user *fx, int size);

static inline int xstate_sigframe_size(void)
{
return use_xsave() ? xstate_size + FP_XSTATE_MAGIC2_SIZE : xstate_size;
}

-static inline int restore_xstate_sig(void __user *buf, int ia32_frame)
+static inline int fpu__restore_sig(void __user *buf, int ia32_frame)
{
void __user *buf_fx = buf;
int size = xstate_sigframe_size();
@@ -514,7 +514,7 @@ static inline int restore_xstate_sig(void __user *buf, int ia32_frame)
size += sizeof(struct i387_fsave_struct);
}

- return __restore_xstate_sig(buf, buf_fx, size);
+ return __fpu__restore_sig(buf, buf_fx, size);
}

/*
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 5e3d9242bb95..ea514d6a34e8 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -388,7 +388,7 @@ static inline int restore_user_xstate(void __user *buf, u64 xbv, int fx_only)
return frstor_user(buf);
}

-int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
+int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
{
int ia32_fxstate = (buf != buf_fx);
struct task_struct *tsk = current;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 6bf512390536..7416fa86f3c7 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -102,7 +102,7 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
get_user_ex(buf, &sc->fpstate);
} get_user_catch(err);

- err |= restore_xstate_sig(buf, config_enabled(CONFIG_X86_32));
+ err |= fpu__restore_sig(buf, config_enabled(CONFIG_X86_32));

force_iret();

--
2.1.0

2015-05-05 17:59:41

by Ingo Molnar

Subject: [PATCH 182/208] x86/fpu: Move the signal frame handling code closer to each other

Consolidate more signal frame related functions:

text data bss dec filename
14108070 2575280 1634304 18317654 vmlinux.before
14107944 2575344 1634304 18317592 vmlinux.after

Also, while moving it, rename alloc_mathframe() to fpu__alloc_mathframe().

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/ia32/ia32_signal.c | 2 +-
arch/x86/include/asm/fpu/internal.h | 38 ++++----------------------------------
arch/x86/kernel/fpu/xstate.c | 39 ++++++++++++++++++++++++++++++++++++++-
arch/x86/kernel/signal.c | 4 ++--
4 files changed, 45 insertions(+), 38 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index de81ddf3b0b2..54605eb1631f 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -325,7 +325,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
if (fpu->fpstate_active) {
unsigned long fx_aligned, math_size;

- sp = alloc_mathframe(sp, 1, &fx_aligned, &math_size);
+ sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size);
*fpstate = (struct _fpstate_ia32 __user *) sp;
if (copy_fpstate_to_sigframe(*fpstate, (void __user *)fx_aligned,
math_size) < 0)
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index d6cfbdafbab2..34fbf95bbe14 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -53,6 +53,7 @@ extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
*/
extern void fpu__save(struct fpu *fpu);
extern void fpu__restore(void);
+extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
extern void fpu__reset(struct fpu *fpu);
@@ -497,25 +498,6 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switc
* Signal frame handlers...
*/
extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fx, int size);
-extern int __fpu__restore_sig(void __user *buf, void __user *fx, int size);
-
-static inline int xstate_sigframe_size(void)
-{
- return use_xsave() ? xstate_size + FP_XSTATE_MAGIC2_SIZE : xstate_size;
-}
-
-static inline int fpu__restore_sig(void __user *buf, int ia32_frame)
-{
- void __user *buf_fx = buf;
- int size = xstate_sigframe_size();
-
- if (ia32_frame && use_fxsr()) {
- buf_fx = buf + sizeof(struct i387_fsave_struct);
- size += sizeof(struct i387_fsave_struct);
- }
-
- return __fpu__restore_sig(buf, buf_fx, size);
-}

/*
* Needs to be preemption-safe.
@@ -565,20 +547,8 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
}
}

-static inline unsigned long
-alloc_mathframe(unsigned long sp, int ia32_frame, unsigned long *buf_fx,
- unsigned long *size)
-{
- unsigned long frame_size = xstate_sigframe_size();
-
- *buf_fx = sp = round_down(sp - frame_size, 64);
- if (ia32_frame && use_fxsr()) {
- frame_size += sizeof(struct i387_fsave_struct);
- sp -= sizeof(struct i387_fsave_struct);
- }
-
- *size = frame_size;
- return sp;
-}
+unsigned long
+fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
+ unsigned long *buf_fx, unsigned long *size);

#endif /* _ASM_X86_FPU_INTERNAL_H */
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index ea514d6a34e8..810f080fadf3 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -388,7 +388,7 @@ static inline int restore_user_xstate(void __user *buf, u64 xbv, int fx_only)
return frstor_user(buf);
}

-int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
+static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
{
int ia32_fxstate = (buf != buf_fx);
struct task_struct *tsk = current;
@@ -482,6 +482,43 @@ int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
return 0;
}

+static inline int xstate_sigframe_size(void)
+{
+ return use_xsave() ? xstate_size + FP_XSTATE_MAGIC2_SIZE : xstate_size;
+}
+
+/*
+ * Restore FPU state from a sigframe:
+ */
+int fpu__restore_sig(void __user *buf, int ia32_frame)
+{
+ void __user *buf_fx = buf;
+ int size = xstate_sigframe_size();
+
+ if (ia32_frame && use_fxsr()) {
+ buf_fx = buf + sizeof(struct i387_fsave_struct);
+ size += sizeof(struct i387_fsave_struct);
+ }
+
+ return __fpu__restore_sig(buf, buf_fx, size);
+}
+
+unsigned long
+fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
+ unsigned long *buf_fx, unsigned long *size)
+{
+ unsigned long frame_size = xstate_sigframe_size();
+
+ *buf_fx = sp = round_down(sp - frame_size, 64);
+ if (ia32_frame && use_fxsr()) {
+ frame_size += sizeof(struct i387_fsave_struct);
+ sp -= sizeof(struct i387_fsave_struct);
+ }
+
+ *size = frame_size;
+
+ return sp;
+}
/*
* Prepare the SW reserved portion of the fxsave memory layout, indicating
* the presence of the extended state information in the memory layout
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 7416fa86f3c7..9554ca69a84e 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -219,8 +219,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
}

if (fpu->fpstate_active) {
- sp = alloc_mathframe(sp, config_enabled(CONFIG_X86_32),
- &buf_fx, &math_size);
+ sp = fpu__alloc_mathframe(sp, config_enabled(CONFIG_X86_32),
+ &buf_fx, &math_size);
*fpstate = (void __user *)sp;
}

--
2.1.0

2015-05-05 17:59:36

by Ingo Molnar

Subject: [PATCH 183/208] x86/fpu: Merge fpu__reset() and fpu__clear()

With recent cleanups and fixes the fpu__reset() and fpu__clear()
functions have become almost identical in functionality: the only
difference is that fpu__reset() assumed that the fpstate
was already active in the eagerfpu case, while fpu__clear()
activated it if it was inactive.

This distinction almost never matters: the only case where such
fpstate activation happens is when the init thread (PID 1) gets
exec()-ed for the first time.

So keep fpu__clear() and change all fpu__reset() uses to
fpu__clear() to simplify the logic.

( In a later patch we'll further simplify fpu__clear() by making
sure that all contexts it is called on are already active. )

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 3 +--
arch/x86/kernel/fpu/core.c | 21 ++++++---------------
arch/x86/kernel/fpu/xstate.c | 4 ++--
arch/x86/kernel/signal.c | 2 +-
4 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 34fbf95bbe14..a55d63efab0f 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -56,7 +56,6 @@ extern void fpu__restore(void);
extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
-extern void fpu__reset(struct fpu *fpu);
extern void fpu__clear(struct fpu *fpu);

extern void fpu__init_check_bugs(void);
@@ -490,7 +489,7 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switc
{
if (fpu_switch.preload) {
if (unlikely(copy_fpstate_to_fpregs(new_fpu)))
- fpu__reset(new_fpu);
+ fpu__clear(new_fpu);
}
}

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 51afe4466ae3..c8eb76a41ce8 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -116,7 +116,7 @@ void __kernel_fpu_end(void)

if (fpu->fpregs_active) {
if (WARN_ON(copy_fpstate_to_fpregs(fpu)))
- fpu__reset(fpu);
+ fpu__clear(fpu);
} else {
__fpregs_deactivate_hw();
}
@@ -339,7 +339,7 @@ void fpu__restore(void)
kernel_fpu_disable();
fpregs_activate(fpu);
if (unlikely(copy_fpstate_to_fpregs(fpu))) {
- fpu__reset(fpu);
+ fpu__clear(fpu);
force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
} else {
tsk->thread.fpu.counter++;
@@ -376,19 +376,10 @@ void fpu__drop(struct fpu *fpu)
}

/*
- * Reset the FPU state back to init state:
- */
-void fpu__reset(struct fpu *fpu)
-{
- if (!use_eager_fpu())
- fpu__drop(fpu);
- else
- restore_init_xstate();
-}
-
-/*
- * Called by sys_execve() to clear the FPU fpregs, so that FPU state
- * of the previous binary does not leak over into the exec()ed binary:
+ * Clear the FPU state back to init state.
+ *
+ * Called by sys_execve(), by the signal handler code and by various
+ * error paths.
*/
void fpu__clear(struct fpu *fpu)
{
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 810f080fadf3..9bc3734acc16 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -401,7 +401,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
config_enabled(CONFIG_IA32_EMULATION));

if (!buf) {
- fpu__reset(fpu);
+ fpu__clear(fpu);
return 0;
}

@@ -474,7 +474,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
*/
user_fpu_begin();
if (restore_user_xstate(buf_fx, xfeatures, fx_only)) {
- fpu__reset(fpu);
+ fpu__clear(fpu);
return -1;
}
}
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 9554ca69a84e..7c08795073d2 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -667,7 +667,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
* Ensure the signal handler starts with the new fpu state.
*/
if (fpu->fpstate_active)
- fpu__reset(fpu);
+ fpu__clear(fpu);
}
signal_setup_done(failed, ksig, stepping);
}
--
2.1.0

2015-05-05 18:08:29

by Ingo Molnar

Subject: [PATCH 184/208] x86/fpu: Move is_ia32*frame() helpers out of fpu/internal.h

Move them to their only user. This makes the code easier to read,
the header is less cluttered, and it also speeds up the build a bit.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 16 ----------------
arch/x86/kernel/signal.c | 16 ++++++++++++++++
2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a55d63efab0f..dc4842b0831b 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -103,22 +103,6 @@ static inline int fpu_want_lazy_restore(struct fpu *fpu, unsigned int cpu)
return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
}

-static inline int is_ia32_compat_frame(void)
-{
- return config_enabled(CONFIG_IA32_EMULATION) &&
- test_thread_flag(TIF_IA32);
-}
-
-static inline int is_ia32_frame(void)
-{
- return config_enabled(CONFIG_X86_32) || is_ia32_compat_frame();
-}
-
-static inline int is_x32_frame(void)
-{
- return config_enabled(CONFIG_X86_X32_ABI) && test_thread_flag(TIF_X32);
-}
-
#define X87_FSW_ES (1 << 7) /* Exception Summary */

static __always_inline __pure bool use_eager_fpu(void)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 7c08795073d2..f4b205686527 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -593,6 +593,22 @@ asmlinkage long sys_rt_sigreturn(void)
return 0;
}

+static inline int is_ia32_compat_frame(void)
+{
+ return config_enabled(CONFIG_IA32_EMULATION) &&
+ test_thread_flag(TIF_IA32);
+}
+
+static inline int is_ia32_frame(void)
+{
+ return config_enabled(CONFIG_X86_32) || is_ia32_compat_frame();
+}
+
+static inline int is_x32_frame(void)
+{
+ return config_enabled(CONFIG_X86_X32_ABI) && test_thread_flag(TIF_X32);
+}
+
static int
setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
{
--
2.1.0

2015-05-05 18:07:31

by Ingo Molnar

Subject: [PATCH 185/208] x86/fpu: Split out fpu/signal.h from fpu/internal.h for signal frame handling functions

Most of the FPU code does not use them, so split them out and
include them from signal.c and ia32_signal.c.

Also fix a header file dependency assumption in fpu/core.c.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/ia32/ia32_signal.c | 1 +
arch/x86/include/asm/fpu/internal.h | 24 ------------------------
arch/x86/include/asm/fpu/signal.h | 31 +++++++++++++++++++++++++++++++
arch/x86/kernel/fpu/core.c | 2 ++
arch/x86/kernel/fpu/xstate.c | 1 +
arch/x86/kernel/ptrace.c | 1 +
arch/x86/kernel/signal.c | 1 +
7 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 54605eb1631f..ae3a29ae875b 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -22,6 +22,7 @@
#include <asm/ucontext.h>
#include <asm/uaccess.h>
#include <asm/fpu/internal.h>
+#include <asm/fpu/signal.h>
#include <asm/ptrace.h>
#include <asm/ia32_unistd.h>
#include <asm/user32.h>
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index dc4842b0831b..e2ceb49d310d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -19,21 +19,6 @@
#include <asm/fpu/api.h>
#include <asm/fpu/xstate.h>

-#ifdef CONFIG_X86_64
-# include <asm/sigcontext32.h>
-# include <asm/user32.h>
-struct ksignal;
-int ia32_setup_rt_frame(int sig, struct ksignal *ksig,
- compat_sigset_t *set, struct pt_regs *regs);
-int ia32_setup_frame(int sig, struct ksignal *ksig,
- compat_sigset_t *set, struct pt_regs *regs);
-#else
-# define user_i387_ia32_struct user_i387_struct
-# define user32_fxsr_struct user_fxsr_struct
-# define ia32_setup_frame __setup_frame
-# define ia32_setup_rt_frame __setup_rt_frame
-#endif
-
#define MXCSR_DEFAULT 0x1f80

extern unsigned int mxcsr_feature_mask;
@@ -63,11 +48,6 @@ extern void fpu__resume_cpu(void);

DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);

-extern void convert_from_fxsr(struct user_i387_ia32_struct *env,
- struct task_struct *tsk);
-extern void convert_to_fxsr(struct task_struct *tsk,
- const struct user_i387_ia32_struct *env);
-
extern user_regset_active_fn regset_fpregs_active, regset_xregset_fpregs_active;
extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get,
xstateregs_get;
@@ -530,8 +510,4 @@ static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
}
}

-unsigned long
-fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
- unsigned long *buf_fx, unsigned long *size);
-
#endif /* _ASM_X86_FPU_INTERNAL_H */
diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
new file mode 100644
index 000000000000..0803dc2aba80
--- /dev/null
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -0,0 +1,31 @@
+/*
+ * x86 FPU signal frame handling methods:
+ */
+#ifndef _ASM_X86_FPU_SIGNAL_H
+#define _ASM_X86_FPU_SIGNAL_H
+
+#ifdef CONFIG_X86_64
+# include <asm/sigcontext32.h>
+# include <asm/user32.h>
+struct ksignal;
+int ia32_setup_rt_frame(int sig, struct ksignal *ksig,
+ compat_sigset_t *set, struct pt_regs *regs);
+int ia32_setup_frame(int sig, struct ksignal *ksig,
+ compat_sigset_t *set, struct pt_regs *regs);
+#else
+# define user_i387_ia32_struct user_i387_struct
+# define user32_fxsr_struct user_fxsr_struct
+# define ia32_setup_frame __setup_frame
+# define ia32_setup_rt_frame __setup_rt_frame
+#endif
+
+extern void convert_from_fxsr(struct user_i387_ia32_struct *env,
+ struct task_struct *tsk);
+extern void convert_to_fxsr(struct task_struct *tsk,
+ const struct user_i387_ia32_struct *env);
+
+unsigned long
+fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
+ unsigned long *buf_fx, unsigned long *size);
+
+#endif /* _ASM_X86_FPU_SIGNAL_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index c8eb76a41ce8..45a2ca26a36f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -6,6 +6,8 @@
* Gareth Hughes <[email protected]>, May 2000
*/
#include <asm/fpu/internal.h>
+#include <asm/fpu/signal.h>
+
#include <linux/hardirq.h>

/*
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 9bc3734acc16..78710740e9a0 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -7,6 +7,7 @@
#include <linux/cpu.h>
#include <asm/fpu/api.h>
#include <asm/fpu/internal.h>
+#include <asm/fpu/signal.h>
#include <asm/sigframe.h>
#include <asm/tlbflush.h>

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 4c615661ec72..51e73a685ce4 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -29,6 +29,7 @@
#include <asm/pgtable.h>
#include <asm/processor.h>
#include <asm/fpu/internal.h>
+#include <asm/fpu/signal.h>
#include <asm/debugreg.h>
#include <asm/ldt.h>
#include <asm/desc.h>
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index f4b205686527..206996c1669d 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -27,6 +27,7 @@
#include <asm/processor.h>
#include <asm/ucontext.h>
#include <asm/fpu/internal.h>
+#include <asm/fpu/signal.h>
#include <asm/vdso.h>
#include <asm/mce.h>
#include <asm/sighandling.h>
--
2.1.0

2015-05-05 18:07:28

by Ingo Molnar

Subject: [PATCH 186/208] x86/fpu: Factor out fpu/regset.h from fpu/internal.h

Only a few places use the regset definitions, so factor them out.

Also fix related header dependency assumptions.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 13 -------------
arch/x86/include/asm/fpu/regset.h | 21 +++++++++++++++++++++
arch/x86/include/asm/fpu/xstate.h | 1 +
arch/x86/kernel/fpu/core.c | 1 +
arch/x86/kernel/fpu/xstate.c | 2 ++
arch/x86/kernel/ptrace.c | 2 +-
6 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index e2ceb49d310d..7b62d9032623 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -10,7 +10,6 @@
#ifndef _ASM_X86_FPU_INTERNAL_H
#define _ASM_X86_FPU_INTERNAL_H

-#include <linux/regset.h>
#include <linux/compat.h>
#include <linux/sched.h>
#include <linux/slab.h>
@@ -48,18 +47,6 @@ extern void fpu__resume_cpu(void);

DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);

-extern user_regset_active_fn regset_fpregs_active, regset_xregset_fpregs_active;
-extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get,
- xstateregs_get;
-extern user_regset_set_fn fpregs_set, xfpregs_set, fpregs_soft_set,
- xstateregs_set;
-
-/*
- * xstateregs_active == regset_fpregs_active. Please refer to the comment
- * at the definition of regset_fpregs_active.
- */
-#define xstateregs_active regset_fpregs_active
-
#ifdef CONFIG_MATH_EMULATION
extern void finit_soft_fpu(struct i387_soft_struct *soft);
#else
diff --git a/arch/x86/include/asm/fpu/regset.h b/arch/x86/include/asm/fpu/regset.h
new file mode 100644
index 000000000000..39d3107ac6c7
--- /dev/null
+++ b/arch/x86/include/asm/fpu/regset.h
@@ -0,0 +1,21 @@
+/*
+ * FPU regset handling methods:
+ */
+#ifndef _ASM_X86_FPU_REGSET_H
+#define _ASM_X86_FPU_REGSET_H
+
+#include <linux/regset.h>
+
+extern user_regset_active_fn regset_fpregs_active, regset_xregset_fpregs_active;
+extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get,
+ xstateregs_get;
+extern user_regset_set_fn fpregs_set, xfpregs_set, fpregs_soft_set,
+ xstateregs_set;
+
+/*
+ * xstateregs_active == regset_fpregs_active. Please refer to the comment
+ * at the definition of regset_fpregs_active.
+ */
+#define xstateregs_active regset_fpregs_active
+
+#endif /* _ASM_X86_FPU_REGSET_H */
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index ab2c507b58b6..afd21329c585 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -3,6 +3,7 @@

#include <linux/types.h>
#include <asm/processor.h>
+#include <linux/uaccess.h>

/* Bit 63 of XCR0 is reserved for future expansion */
#define XSTATE_EXTEND_MASK (~(XSTATE_FPSSE | (1ULL << 63)))
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 45a2ca26a36f..b34c1a72d838 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -6,6 +6,7 @@
* Gareth Hughes <[email protected]>, May 2000
*/
#include <asm/fpu/internal.h>
+#include <asm/fpu/regset.h>
#include <asm/fpu/signal.h>

#include <linux/hardirq.h>
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 78710740e9a0..59bd35a57afc 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -5,9 +5,11 @@
*/
#include <linux/compat.h>
#include <linux/cpu.h>
+
#include <asm/fpu/api.h>
#include <asm/fpu/internal.h>
#include <asm/fpu/signal.h>
+#include <asm/fpu/regset.h>
#include <asm/sigframe.h>
#include <asm/tlbflush.h>

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 51e73a685ce4..9be72bc3613f 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -11,7 +11,6 @@
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/ptrace.h>
-#include <linux/regset.h>
#include <linux/tracehook.h>
#include <linux/user.h>
#include <linux/elf.h>
@@ -30,6 +29,7 @@
#include <asm/processor.h>
#include <asm/fpu/internal.h>
#include <asm/fpu/signal.h>
+#include <asm/fpu/regset.h>
#include <asm/debugreg.h>
#include <asm/ldt.h>
#include <asm/desc.h>
--
2.1.0

2015-05-05 18:06:56

by Ingo Molnar

Subject: [PATCH 187/208] x86/fpu: Remove run-once init quirks

Remove various boot quirks that came from the old code.

The new code is cleanly split up into per-system and per-cpu
init sequences, and system init functions are only called once.

Remove the run-once quirks.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/fpu/init.c | 6 ------
arch/x86/kernel/fpu/xstate.c | 11 -----------
2 files changed, 17 deletions(-)

diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 72219ce2385a..7b6265df6082 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -143,12 +143,6 @@ EXPORT_SYMBOL_GPL(xstate_size);
*/
static void fpu__init_system_xstate_size_legacy(void)
{
- static bool on_boot_cpu = 1;
-
- if (!on_boot_cpu)
- return;
- on_boot_cpu = 0;
-
/*
* Note that xstate_size might be overwriten later during
* fpu__init_system_xstate().
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 59bd35a57afc..8285d4b40763 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -658,12 +658,6 @@ void setup_xstate_comp(void)
*/
static void setup_init_fpu_buf(void)
{
- static int on_boot_cpu = 1;
-
- if (!on_boot_cpu)
- return;
- on_boot_cpu = 0;
-
if (!cpu_has_xsave)
return;

@@ -719,11 +713,6 @@ static void __init init_xstate_size(void)
void fpu__init_system_xstate(void)
{
unsigned int eax, ebx, ecx, edx;
- static bool on_boot_cpu = 1;
-
- if (!on_boot_cpu)
- return;
- on_boot_cpu = 0;

if (!cpu_has_xsave) {
pr_info("x86/fpu: Legacy x87 FPU detected.\n");
--
2.1.0

2015-05-05 18:06:03

by Ingo Molnar

Subject: [PATCH 188/208] x86/fpu: Factor out the exception error code handling code

Factor out the FPU error code handling code from traps.c and fpu/internal.h
and move them close to each other.

Also convert the helper functions to 'struct fpu *', which further simplifies
them.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 33 ++--------------------
arch/x86/kernel/fpu/core.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/traps.c | 67 ++++++++-----------------------------------
3 files changed, 102 insertions(+), 86 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 7b62d9032623..dfdafea6e56f 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -30,7 +30,8 @@ extern void fpu__init_system(struct cpuinfo_x86 *c);
extern void fpu__activate_curr(struct fpu *fpu);
extern void fpstate_init(struct fpu *fpu);

-extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
+extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
+extern int fpu__exception_code(struct fpu *fpu, int trap_nr);

/*
* High level FPU state handling functions:
@@ -467,34 +468,4 @@ static inline void user_fpu_begin(void)
preempt_enable();
}

-/*
- * i387 state interaction
- */
-static inline unsigned short get_fpu_cwd(struct task_struct *tsk)
-{
- if (cpu_has_fxsr) {
- return tsk->thread.fpu.state.fxsave.cwd;
- } else {
- return (unsigned short)tsk->thread.fpu.state.fsave.cwd;
- }
-}
-
-static inline unsigned short get_fpu_swd(struct task_struct *tsk)
-{
- if (cpu_has_fxsr) {
- return tsk->thread.fpu.state.fxsave.swd;
- } else {
- return (unsigned short)tsk->thread.fpu.state.fsave.swd;
- }
-}
-
-static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk)
-{
- if (cpu_has_xmm) {
- return tsk->thread.fpu.state.fxsave.mxcsr;
- } else {
- return MXCSR_DEFAULT;
- }
-}
-
#endif /* _ASM_X86_FPU_INTERNAL_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index b34c1a72d838..38274909e193 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -8,6 +8,7 @@
#include <asm/fpu/internal.h>
#include <asm/fpu/regset.h>
#include <asm/fpu/signal.h>
+#include <asm/traps.h>

#include <linux/hardirq.h>

@@ -749,3 +750,90 @@ int dump_fpu(struct pt_regs *regs, struct user_i387_struct *ufpu)
EXPORT_SYMBOL(dump_fpu);

#endif /* CONFIG_X86_32 || CONFIG_IA32_EMULATION */
+
+/*
+ * x87 math exception handling:
+ */
+
+static inline unsigned short get_fpu_cwd(struct fpu *fpu)
+{
+ if (cpu_has_fxsr) {
+ return fpu->state.fxsave.cwd;
+ } else {
+ return (unsigned short)fpu->state.fsave.cwd;
+ }
+}
+
+static inline unsigned short get_fpu_swd(struct fpu *fpu)
+{
+ if (cpu_has_fxsr) {
+ return fpu->state.fxsave.swd;
+ } else {
+ return (unsigned short)fpu->state.fsave.swd;
+ }
+}
+
+static inline unsigned short get_fpu_mxcsr(struct fpu *fpu)
+{
+ if (cpu_has_xmm) {
+ return fpu->state.fxsave.mxcsr;
+ } else {
+ return MXCSR_DEFAULT;
+ }
+}
+
+int fpu__exception_code(struct fpu *fpu, int trap_nr)
+{
+ int err;
+
+ if (trap_nr == X86_TRAP_MF) {
+ unsigned short cwd, swd;
+ /*
+ * (~cwd & swd) will mask out exceptions that are not set to unmasked
+ * status. 0x3f is the exception bits in these regs, 0x200 is the
+ * C1 reg you need in case of a stack fault, 0x040 is the stack
+ * fault bit. We should only be taking one exception at a time,
+ * so if this combination doesn't produce any single exception,
+ * then we have a bad program that isn't synchronizing its FPU usage
+ * and it will suffer the consequences since we won't be able to
+ * fully reproduce the context of the exception
+ */
+ cwd = get_fpu_cwd(fpu);
+ swd = get_fpu_swd(fpu);
+
+ err = swd & ~cwd;
+ } else {
+ /*
+ * The SIMD FPU exceptions are handled a little differently, as there
+ * is only a single status/control register. Thus, to determine which
+ * unmasked exception was caught we must mask the exception mask bits
+ * at 0x1f80, and then use these to mask the exception bits at 0x3f.
+ */
+ unsigned short mxcsr = get_fpu_mxcsr(fpu);
+ err = ~(mxcsr >> 7) & mxcsr;
+ }
+
+ if (err & 0x001) { /* Invalid op */
+ /*
+ * swd & 0x240 == 0x040: Stack Underflow
+ * swd & 0x240 == 0x240: Stack Overflow
+ * User must clear the SF bit (0x40) if set
+ */
+ return FPE_FLTINV;
+ } else if (err & 0x004) { /* Divide by Zero */
+ return FPE_FLTDIV;
+ } else if (err & 0x008) { /* Overflow */
+ return FPE_FLTOVF;
+ } else if (err & 0x012) { /* Denormal, Underflow */
+ return FPE_FLTUND;
+ } else if (err & 0x020) { /* Precision */
+ return FPE_FLTRES;
+ }
+
+ /*
+ * If we're using IRQ 13, or supposedly even some trap
+ * X86_TRAP_MF implementations, it's possible
+ * we get a spurious trap, which is not an error.
+ */
+ return 0;
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 48dfcd9ed351..cab397d0085f 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -708,8 +708,8 @@ NOKPROBE_SYMBOL(do_debug);
static void math_error(struct pt_regs *regs, int error_code, int trapnr)
{
struct task_struct *task = current;
+ struct fpu *fpu = &task->thread.fpu;
siginfo_t info;
- unsigned short err;
char *str = (trapnr == X86_TRAP_MF) ? "fpu exception" :
"simd exception";

@@ -717,8 +717,7 @@ static void math_error(struct pt_regs *regs, int error_code, int trapnr)
return;
conditional_sti(regs);

- if (!user_mode(regs))
- {
+ if (!user_mode(regs)) {
if (!fixup_exception(regs)) {
task->thread.error_code = error_code;
task->thread.trap_nr = trapnr;
@@ -730,62 +729,20 @@ static void math_error(struct pt_regs *regs, int error_code, int trapnr)
/*
* Save the info for the exception handler and clear the error.
*/
- fpu__save(&task->thread.fpu);
- task->thread.trap_nr = trapnr;
+ fpu__save(fpu);
+
+ task->thread.trap_nr = trapnr;
task->thread.error_code = error_code;
- info.si_signo = SIGFPE;
- info.si_errno = 0;
- info.si_addr = (void __user *)uprobe_get_trap_addr(regs);
- if (trapnr == X86_TRAP_MF) {
- unsigned short cwd, swd;
- /*
- * (~cwd & swd) will mask out exceptions that are not set to unmasked
- * status. 0x3f is the exception bits in these regs, 0x200 is the
- * C1 reg you need in case of a stack fault, 0x040 is the stack
- * fault bit. We should only be taking one exception at a time,
- * so if this combination doesn't produce any single exception,
- * then we have a bad program that isn't synchronizing its FPU usage
- * and it will suffer the consequences since we won't be able to
- * fully reproduce the context of the exception
- */
- cwd = get_fpu_cwd(task);
- swd = get_fpu_swd(task);
+ info.si_signo = SIGFPE;
+ info.si_errno = 0;
+ info.si_addr = (void __user *)uprobe_get_trap_addr(regs);

- err = swd & ~cwd;
- } else {
- /*
- * The SIMD FPU exceptions are handled a little differently, as there
- * is only a single status/control register. Thus, to determine which
- * unmasked exception was caught we must mask the exception mask bits
- * at 0x1f80, and then use these to mask the exception bits at 0x3f.
- */
- unsigned short mxcsr = get_fpu_mxcsr(task);
- err = ~(mxcsr >> 7) & mxcsr;
- }
+ info.si_code = fpu__exception_code(fpu, trapnr);

- if (err & 0x001) { /* Invalid op */
- /*
- * swd & 0x240 == 0x040: Stack Underflow
- * swd & 0x240 == 0x240: Stack Overflow
- * User must clear the SF bit (0x40) if set
- */
- info.si_code = FPE_FLTINV;
- } else if (err & 0x004) { /* Divide by Zero */
- info.si_code = FPE_FLTDIV;
- } else if (err & 0x008) { /* Overflow */
- info.si_code = FPE_FLTOVF;
- } else if (err & 0x012) { /* Denormal, Underflow */
- info.si_code = FPE_FLTUND;
- } else if (err & 0x020) { /* Precision */
- info.si_code = FPE_FLTRES;
- } else {
- /*
- * If we're using IRQ 13, or supposedly even some trap
- * X86_TRAP_MF implementations, it's possible
- * we get a spurious trap, which is not an error.
- */
+ /* Retry when we get spurious exceptions: */
+ if (!info.si_code)
return;
- }
+
force_sig_info(SIGFPE, &info, task);
}

--
2.1.0

2015-05-05 17:59:48

by Ingo Molnar

Subject: [PATCH 189/208] x86/fpu: Harmonize the names of the fpstate_init() helper functions

Harmonize the inconsistent naming of these related functions:

fpstate_init()
finit_soft_fpu() => fpstate_init_fsoft()
fx_finit() => fpstate_init_fxstate()
fx_finit() => fpstate_init_fstate() # split out

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 23 +++++++++++------------
arch/x86/kernel/fpu/core.c | 26 ++++++++++++++++----------
arch/x86/kernel/fpu/init.c | 2 +-
arch/x86/math-emu/fpu_aux.c | 4 ++--
4 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index dfdafea6e56f..0236ae6ffc26 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -28,7 +28,18 @@ extern void fpu__init_cpu_xstate(void);
extern void fpu__init_system(struct cpuinfo_x86 *c);

extern void fpu__activate_curr(struct fpu *fpu);
+
extern void fpstate_init(struct fpu *fpu);
+#ifdef CONFIG_MATH_EMULATION
+extern void fpstate_init_soft(struct i387_soft_struct *soft);
+#else
+static inline void fpstate_init_soft(struct i387_soft_struct *soft) {}
+#endif
+static inline void fpstate_init_fxstate(struct i387_fxsave_struct *fx)
+{
+ fx->cwd = 0x37f;
+ fx->mxcsr = MXCSR_DEFAULT;
+}

extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
@@ -48,12 +59,6 @@ extern void fpu__resume_cpu(void);

DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);

-#ifdef CONFIG_MATH_EMULATION
-extern void finit_soft_fpu(struct i387_soft_struct *soft);
-#else
-static inline void finit_soft_fpu(struct i387_soft_struct *soft) {}
-#endif
-
/*
* Must be run with preemption disabled: this clears the fpu_fpregs_owner_ctx,
* on this CPU.
@@ -93,12 +98,6 @@ static __always_inline __pure bool use_fxsr(void)
return static_cpu_has_safe(X86_FEATURE_FXSR);
}

-static inline void fx_finit(struct i387_fxsave_struct *fx)
-{
- fx->cwd = 0x37f;
- fx->mxcsr = MXCSR_DEFAULT;
-}
-
extern void fpstate_sanitize_xstate(struct fpu *fpu);

#define user_insn(insn, output, input...) \
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 38274909e193..8eeba60bf78b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -191,24 +191,30 @@ void fpu__save(struct fpu *fpu)
}
EXPORT_SYMBOL_GPL(fpu__save);

+/*
+ * Legacy x87 fpstate state init:
+ */
+static inline void fpstate_init_fstate(struct i387_fsave_struct *fp)
+{
+ fp->cwd = 0xffff037fu;
+ fp->swd = 0xffff0000u;
+ fp->twd = 0xffffffffu;
+ fp->fos = 0xffff0000u;
+}
+
void fpstate_init(struct fpu *fpu)
{
if (!cpu_has_fpu) {
- finit_soft_fpu(&fpu->state.soft);
+ fpstate_init_soft(&fpu->state.soft);
return;
}

memset(&fpu->state, 0, xstate_size);

- if (cpu_has_fxsr) {
- fx_finit(&fpu->state.fxsave);
- } else {
- struct i387_fsave_struct *fp = &fpu->state.fsave;
- fp->cwd = 0xffff037fu;
- fp->swd = 0xffff0000u;
- fp->twd = 0xffffffffu;
- fp->fos = 0xffff0000u;
- }
+ if (cpu_has_fxsr)
+ fpstate_init_fxstate(&fpu->state.fxsave);
+ else
+ fpstate_init_fstate(&fpu->state.fsave);
}
EXPORT_SYMBOL_GPL(fpstate_init);

diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 7b6265df6082..5a7e57078935 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -121,7 +121,7 @@ static void fpu__init_system_generic(void)
* Set up the legacy init FPU context. (xstate init might overwrite this
* with a more modern format, if the CPU supports it.)
*/
- fx_finit(&init_xstate_ctx.i387);
+ fpstate_init_fxstate(&init_xstate_ctx.i387);

fpu__init_system_mxcsr();
}
diff --git a/arch/x86/math-emu/fpu_aux.c b/arch/x86/math-emu/fpu_aux.c
index 7562341ce299..768b2b8271d6 100644
--- a/arch/x86/math-emu/fpu_aux.c
+++ b/arch/x86/math-emu/fpu_aux.c
@@ -30,7 +30,7 @@ static void fclex(void)
}

/* Needs to be externally visible */
-void finit_soft_fpu(struct i387_soft_struct *soft)
+void fpstate_init_soft(struct i387_soft_struct *soft)
{
struct address *oaddr, *iaddr;
memset(soft, 0, sizeof(*soft));
@@ -52,7 +52,7 @@ void finit_soft_fpu(struct i387_soft_struct *soft)

void finit(void)
{
- finit_soft_fpu(&current->thread.fpu.state.soft);
+ fpstate_init_soft(&current->thread.fpu.state.soft);
}

/*
--
2.1.0

2015-05-05 17:59:46

by Ingo Molnar

Subject: [PATCH 190/208] x86/fpu: Create 'union thread_xstate' helper for fpstate_init()

fpstate_init() only uses fpu->state, so pass that in to it.

This enables the cleanup we will do in the next patch.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 2 +-
arch/x86/kernel/fpu/core.c | 14 +++++++-------
arch/x86/kernel/fpu/xstate.c | 2 +-
arch/x86/kvm/x86.c | 2 +-
4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 0236ae6ffc26..b74aa4329aeb 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -29,7 +29,7 @@ extern void fpu__init_system(struct cpuinfo_x86 *c);

extern void fpu__activate_curr(struct fpu *fpu);

-extern void fpstate_init(struct fpu *fpu);
+extern void fpstate_init(union thread_xstate *state);
#ifdef CONFIG_MATH_EMULATION
extern void fpstate_init_soft(struct i387_soft_struct *soft);
#else
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8eeba60bf78b..2642a1ebed2a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -202,19 +202,19 @@ static inline void fpstate_init_fstate(struct i387_fsave_struct *fp)
fp->fos = 0xffff0000u;
}

-void fpstate_init(struct fpu *fpu)
+void fpstate_init(union thread_xstate *state)
{
if (!cpu_has_fpu) {
- fpstate_init_soft(&fpu->state.soft);
+ fpstate_init_soft(&state->soft);
return;
}

- memset(&fpu->state, 0, xstate_size);
+ memset(state, 0, xstate_size);

if (cpu_has_fxsr)
- fpstate_init_fxstate(&fpu->state.fxsave);
+ fpstate_init_fxstate(&state->fxsave);
else
- fpstate_init_fstate(&fpu->state.fsave);
+ fpstate_init_fstate(&state->fsave);
}
EXPORT_SYMBOL_GPL(fpstate_init);

@@ -282,7 +282,7 @@ void fpu__activate_curr(struct fpu *fpu)
WARN_ON_ONCE(fpu != &current->thread.fpu);

if (!fpu->fpstate_active) {
- fpstate_init(fpu);
+ fpstate_init(&fpu->state);

/* Safe to do for the current task: */
fpu->fpstate_active = 1;
@@ -321,7 +321,7 @@ static void fpu__activate_stopped(struct fpu *child_fpu)
if (child_fpu->fpstate_active) {
child_fpu->last_cpu = -1;
} else {
- fpstate_init(child_fpu);
+ fpstate_init(&child_fpu->state);

/* Safe to do for stopped child tasks: */
child_fpu->fpstate_active = 1;
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 8285d4b40763..afbd58277430 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -456,7 +456,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)

if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) ||
__copy_from_user(&env, buf, sizeof(env))) {
- fpstate_init(fpu);
+ fpstate_init(&fpu->state);
err = -1;
} else {
sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f4438179398b..3d811bb2728f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7004,7 +7004,7 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)

static void fx_init(struct kvm_vcpu *vcpu)
{
- fpstate_init(&vcpu->arch.guest_fpu);
+ fpstate_init(&vcpu->arch.guest_fpu.state);
if (cpu_has_xsaves)
vcpu->arch.guest_fpu.state.xsave.header.xcomp_bv =
host_xcr0 | XSTATE_COMPACTION_ENABLED;
--
2.1.0

2015-05-05 18:05:45

by Ingo Molnar

Subject: [PATCH 191/208] x86/fpu: Generalize 'init_xstate_ctx'

So the handling of init_xstate_ctx has a layering violation: both
'struct xsave_struct' and 'union thread_xstate' have a
'struct i387_fxsave_struct' member:

xsave_struct::i387
thread_xstate::fxsave

The handling of init_xstate_ctx is generic: it is used on all
CPUs, with or without the XSAVE instruction. So it's confusing
that this generic code passes around and handles an XSAVE-specific
format.

What we really want is for init_xstate_ctx to be a proper
fpstate, whose ::fxsave and ::xsave members we use as
appropriate.

Since xsave_struct::i387 and thread_xstate::fxsave alias
each other, this is not a functional problem.

So implement this, and move init_xstate_ctx to the generic FPU
code in the process.

Also, since init_xstate_ctx is not XSAVE-specific anymore,
rename it to init_fpstate and mark it __read_mostly:
it's only modified once during bootup, and is used
as a reference fpstate later on.

There's no change in functionality.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 6 ++++--
arch/x86/include/asm/fpu/xstate.h | 1 -
arch/x86/kernel/fpu/core.c | 6 ++++++
arch/x86/kernel/fpu/init.c | 2 +-
arch/x86/kernel/fpu/xstate.c | 19 +++++++------------
5 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index b74aa4329aeb..792fdbe64179 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -22,6 +22,8 @@

extern unsigned int mxcsr_feature_mask;

+extern union thread_xstate init_fpstate;
+
extern void fpu__init_cpu(void);
extern void fpu__init_system_xstate(void);
extern void fpu__init_cpu_xstate(void);
@@ -342,9 +344,9 @@ static inline void fpregs_deactivate(struct fpu *fpu)
static inline void restore_init_xstate(void)
{
if (use_xsave())
- xrstor_state(&init_xstate_ctx, -1);
+ xrstor_state(&init_fpstate.xsave, -1);
else
- fxrstor_checking(&init_xstate_ctx.i387);
+ fxrstor_checking(&init_fpstate.fxsave);
}

/*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index afd21329c585..3051280887b8 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -37,7 +37,6 @@
extern unsigned int xstate_size;
extern u64 xfeatures_mask;
extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
-extern struct xsave_struct init_xstate_ctx;

extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2642a1ebed2a..43c3f40aa447 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -13,6 +13,12 @@
#include <linux/hardirq.h>

/*
+ * Represents the initial FPU state. It's mostly (but not completely) zeroes,
+ * depending on the FPU hardware format:
+ */
+union thread_xstate init_fpstate __read_mostly;
+
+/*
* Track whether the kernel is using the FPU state
* currently.
*
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 5a7e57078935..93bc11a5812c 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -121,7 +121,7 @@ static void fpu__init_system_generic(void)
* Set up the legacy init FPU context. (xstate init might overwrite this
* with a more modern format, if the CPU supports it.)
*/
- fpstate_init_fxstate(&init_xstate_ctx.i387);
+ fpstate_init_fxstate(&init_fpstate.fxsave);

fpu__init_system_mxcsr();
}
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index afbd58277430..527d4bf4f304 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -31,11 +31,6 @@ static const char *xfeature_names[] =
*/
u64 xfeatures_mask __read_mostly;

-/*
- * Represents init state for the supported extended state.
- */
-struct xsave_struct init_xstate_ctx;
-
static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
static unsigned int xstate_offsets[XFEATURES_NR_MAX], xstate_sizes[XFEATURES_NR_MAX];
static unsigned int xstate_comp_offsets[sizeof(xfeatures_mask)*8];
@@ -150,7 +145,7 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
int size = xstate_sizes[feature_bit];

memcpy((void *)fx + offset,
- (void *)&init_xstate_ctx + offset,
+ (void *)&init_fpstate.xsave + offset,
size);
}

@@ -377,12 +372,12 @@ static inline int restore_user_xstate(void __user *buf, u64 xbv, int fx_only)
if (use_xsave()) {
if ((unsigned long)buf % 64 || fx_only) {
u64 init_bv = xfeatures_mask & ~XSTATE_FPSSE;
- xrstor_state(&init_xstate_ctx, init_bv);
+ xrstor_state(&init_fpstate.xsave, init_bv);
return fxrstor_user(buf);
} else {
u64 init_bv = xfeatures_mask & ~xbv;
if (unlikely(init_bv))
- xrstor_state(&init_xstate_ctx, init_bv);
+ xrstor_state(&init_fpstate.xsave, init_bv);
return xrestore_user(buf, xbv);
}
} else if (use_fxsr()) {
@@ -665,20 +660,20 @@ static void setup_init_fpu_buf(void)
print_xstate_features();

if (cpu_has_xsaves) {
- init_xstate_ctx.header.xcomp_bv = (u64)1 << 63 | xfeatures_mask;
- init_xstate_ctx.header.xfeatures = xfeatures_mask;
+ init_fpstate.xsave.header.xcomp_bv = (u64)1 << 63 | xfeatures_mask;
+ init_fpstate.xsave.header.xfeatures = xfeatures_mask;
}

/*
* Init all the features state with header_bv being 0x0
*/
- xrstor_state_booting(&init_xstate_ctx, -1);
+ xrstor_state_booting(&init_fpstate.xsave, -1);

/*
* Dump the init state again. This is to identify the init state
* of any feature which is not represented by all zero's.
*/
- xsave_state_booting(&init_xstate_ctx);
+ xsave_state_booting(&init_fpstate.xsave);
}

/*
--
2.1.0

2015-05-05 18:05:15

by Ingo Molnar

Subject: [PATCH 192/208] x86/fpu: Move restore_init_xstate() out of fpu/internal.h

Move restore_init_xstate() next to its sole caller.

Also rename it to copy_init_fpstate_to_fpregs() and add
some comments about what it does.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 8 --------
arch/x86/kernel/fpu/core.c | 14 +++++++++++++-
2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 792fdbe64179..a1810eb39afa 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -341,14 +341,6 @@ static inline void fpregs_deactivate(struct fpu *fpu)
__fpregs_deactivate_hw();
}

-static inline void restore_init_xstate(void)
-{
- if (use_xsave())
- xrstor_state(&init_fpstate.xsave, -1);
- else
- fxrstor_checking(&init_fpstate.fxsave);
-}
-
/*
* Definitions for the eXtended Control Register instructions
*/
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 43c3f40aa447..c9878a7c21cf 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -392,6 +392,18 @@ void fpu__drop(struct fpu *fpu)
}

/*
+ * Clear FPU registers by setting them up from
+ * the init fpstate:
+ */
+static inline void copy_init_fpstate_to_fpregs(void)
+{
+ if (use_xsave())
+ xrstor_state(&init_fpstate.xsave, -1);
+ else
+ fxrstor_checking(&init_fpstate.fxsave);
+}
+
+/*
* Clear the FPU state back to init state.
*
* Called by sys_execve(), by the signal handler code and by various
@@ -409,7 +421,7 @@ void fpu__clear(struct fpu *fpu)
fpu__activate_curr(fpu);
user_fpu_begin();
}
- restore_init_xstate();
+ copy_init_fpstate_to_fpregs();
}
}

--
2.1.0

2015-05-05 18:04:46

by Ingo Molnar

Subject: [PATCH 193/208] x86/fpu: Rename all the fpregs, xregs, fxregs and fregs handling functions

Standardize the naming of the various functions that copy register
content in specific FPU context formats:

copy_fxregs_to_kernel() # was: fpu_fxsave()
copy_xregs_to_kernel() # was: xsave_state()

copy_kernel_to_fregs() # was: frstor_checking()
copy_kernel_to_fxregs() # was: fxrstor_checking()
copy_kernel_to_xregs() # was: fpu_xrstor_checking()
copy_kernel_to_xregs_booting() # was: xrstor_state_booting()

copy_fregs_to_user() # was: fsave_user()
copy_fxregs_to_user() # was: fxsave_user()
copy_xregs_to_user() # was: xsave_user()

copy_user_to_fregs() # was: frstor_user()
copy_user_to_fxregs() # was: fxrstor_user()
copy_user_to_xregs() # was: xrestore_user()
copy_user_to_fpregs_zeroing() # was: restore_user_xstate()

Eliminate fpu_xrstor_checking(), because it was just a wrapper.

No change in functionality.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 30 +++++++++++++++---------------
arch/x86/include/asm/fpu/xstate.h | 20 ++++++--------------
arch/x86/kernel/fpu/core.c | 4 ++--
arch/x86/kernel/fpu/xstate.c | 28 ++++++++++++++--------------
arch/x86/mm/mpx.c | 2 +-
5 files changed, 38 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a1810eb39afa..f23ea10d3a1f 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -133,57 +133,57 @@ extern void fpstate_sanitize_xstate(struct fpu *fpu);
err; \
})

-static inline int fsave_user(struct i387_fsave_struct __user *fx)
+static inline int copy_fregs_to_user(struct i387_fsave_struct __user *fx)
{
return user_insn(fnsave %[fx]; fwait, [fx] "=m" (*fx), "m" (*fx));
}

-static inline int fxsave_user(struct i387_fxsave_struct __user *fx)
+static inline int copy_fxregs_to_user(struct i387_fxsave_struct __user *fx)
{
if (config_enabled(CONFIG_X86_32))
return user_insn(fxsave %[fx], [fx] "=m" (*fx), "m" (*fx));
else if (config_enabled(CONFIG_AS_FXSAVEQ))
return user_insn(fxsaveq %[fx], [fx] "=m" (*fx), "m" (*fx));

- /* See comment in fpu_fxsave() below. */
+ /* See comment in copy_fxregs_to_kernel() below. */
return user_insn(rex64/fxsave (%[fx]), "=m" (*fx), [fx] "R" (fx));
}

-static inline int fxrstor_checking(struct i387_fxsave_struct *fx)
+static inline int copy_kernel_to_fxregs(struct i387_fxsave_struct *fx)
{
if (config_enabled(CONFIG_X86_32))
return check_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
else if (config_enabled(CONFIG_AS_FXSAVEQ))
return check_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));

- /* See comment in fpu_fxsave() below. */
+ /* See comment in copy_fxregs_to_kernel() below. */
return check_insn(rex64/fxrstor (%[fx]), "=m" (*fx), [fx] "R" (fx),
"m" (*fx));
}

-static inline int fxrstor_user(struct i387_fxsave_struct __user *fx)
+static inline int copy_user_to_fxregs(struct i387_fxsave_struct __user *fx)
{
if (config_enabled(CONFIG_X86_32))
return user_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
else if (config_enabled(CONFIG_AS_FXSAVEQ))
return user_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));

- /* See comment in fpu_fxsave() below. */
+ /* See comment in copy_fxregs_to_kernel() below. */
return user_insn(rex64/fxrstor (%[fx]), "=m" (*fx), [fx] "R" (fx),
"m" (*fx));
}

-static inline int frstor_checking(struct i387_fsave_struct *fx)
+static inline int copy_kernel_to_fregs(struct i387_fsave_struct *fx)
{
return check_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
}

-static inline int frstor_user(struct i387_fsave_struct __user *fx)
+static inline int copy_user_to_fregs(struct i387_fsave_struct __user *fx)
{
return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
}

-static inline void fpu_fxsave(struct fpu *fpu)
+static inline void copy_fxregs_to_kernel(struct fpu *fpu)
{
if (config_enabled(CONFIG_X86_32))
asm volatile( "fxsave %[fx]" : [fx] "=m" (fpu->state.fxsave));
@@ -230,12 +230,12 @@ static inline void fpu_fxsave(struct fpu *fpu)
static inline int copy_fpregs_to_fpstate(struct fpu *fpu)
{
if (likely(use_xsave())) {
- xsave_state(&fpu->state.xsave);
+ copy_xregs_to_kernel(&fpu->state.xsave);
return 1;
}

if (likely(use_fxsr())) {
- fpu_fxsave(fpu);
+ copy_fxregs_to_kernel(fpu);
return 1;
}

@@ -251,11 +251,11 @@ static inline int copy_fpregs_to_fpstate(struct fpu *fpu)
static inline int __copy_fpstate_to_fpregs(struct fpu *fpu)
{
if (use_xsave())
- return fpu_xrstor_checking(&fpu->state.xsave);
+ return copy_kernel_to_xregs(&fpu->state.xsave, -1);
else if (use_fxsr())
- return fxrstor_checking(&fpu->state.fxsave);
+ return copy_kernel_to_fxregs(&fpu->state.fxsave);
else
- return frstor_checking(&fpu->state.fsave);
+ return copy_kernel_to_fregs(&fpu->state.fsave);
}

static inline int copy_fpstate_to_fpregs(struct fpu *fpu)
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 3051280887b8..7f59480697a3 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -58,7 +58,7 @@ extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
* This function is called only during boot time when x86 caps are not set
* up and alternative can not be used yet.
*/
-static inline int xsave_state_booting(struct xsave_struct *fx)
+static inline int copy_xregs_to_kernel_booting(struct xsave_struct *fx)
{
u64 mask = -1;
u32 lmask = mask;
@@ -86,7 +86,7 @@ static inline int xsave_state_booting(struct xsave_struct *fx)
* This function is called only during boot time when x86 caps are not set
* up and alternative can not be used yet.
*/
-static inline int xrstor_state_booting(struct xsave_struct *fx, u64 mask)
+static inline int copy_kernel_to_xregs_booting(struct xsave_struct *fx, u64 mask)
{
u32 lmask = mask;
u32 hmask = mask >> 32;
@@ -112,7 +112,7 @@ static inline int xrstor_state_booting(struct xsave_struct *fx, u64 mask)
/*
* Save processor xstate to xsave area.
*/
-static inline int xsave_state(struct xsave_struct *fx)
+static inline int copy_xregs_to_kernel(struct xsave_struct *fx)
{
u64 mask = -1;
u32 lmask = mask;
@@ -151,7 +151,7 @@ static inline int xsave_state(struct xsave_struct *fx)
/*
* Restore processor xstate from xsave area.
*/
-static inline int xrstor_state(struct xsave_struct *fx, u64 mask)
+static inline int copy_kernel_to_xregs(struct xsave_struct *fx, u64 mask)
{
int err = 0;
u32 lmask = mask;
@@ -177,14 +177,6 @@ static inline int xrstor_state(struct xsave_struct *fx, u64 mask)
}

/*
- * Restore xstate context for new process during context switch.
- */
-static inline int fpu_xrstor_checking(struct xsave_struct *fx)
-{
- return xrstor_state(fx, -1);
-}
-
-/*
* Save xstate to user space xsave area.
*
* We don't use modified optimization because xrstor/xrstors might track
@@ -194,7 +186,7 @@ static inline int fpu_xrstor_checking(struct xsave_struct *fx)
* backward compatibility for old applications which don't understand
* compacted format of xsave area.
*/
-static inline int xsave_user(struct xsave_struct __user *buf)
+static inline int copy_xregs_to_user(struct xsave_struct __user *buf)
{
int err;

@@ -218,7 +210,7 @@ static inline int xsave_user(struct xsave_struct __user *buf)
/*
* Restore xstate from user space xsave area.
*/
-static inline int xrestore_user(struct xsave_struct __user *buf, u64 mask)
+static inline int copy_user_to_xregs(struct xsave_struct __user *buf, u64 mask)
{
int err = 0;
struct xsave_struct *xstate = ((__force struct xsave_struct *)buf);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index c9878a7c21cf..02eaec4722ba 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -398,9 +398,9 @@ void fpu__drop(struct fpu *fpu)
static inline void copy_init_fpstate_to_fpregs(void)
{
if (use_xsave())
- xrstor_state(&init_fpstate.xsave, -1);
+ copy_kernel_to_xregs(&init_fpstate.xsave, -1);
else
- fxrstor_checking(&init_fpstate.fxsave);
+ copy_kernel_to_fxregs(&init_fpstate.fxsave);
}

/*
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 527d4bf4f304..336b3dae59ca 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -260,11 +260,11 @@ static inline int copy_fpregs_to_sigframe(struct xsave_struct __user *buf)
int err;

if (use_xsave())
- err = xsave_user(buf);
+ err = copy_xregs_to_user(buf);
else if (use_fxsr())
- err = fxsave_user((struct i387_fxsave_struct __user *) buf);
+ err = copy_fxregs_to_user((struct i387_fxsave_struct __user *) buf);
else
- err = fsave_user((struct i387_fsave_struct __user *) buf);
+ err = copy_fregs_to_user((struct i387_fsave_struct __user *) buf);

if (unlikely(err) && __clear_user(buf, xstate_size))
err = -EFAULT;
@@ -314,7 +314,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
return -1;
/* Update the thread's fxstate to save the fsave header. */
if (ia32_fxstate)
- fpu_fxsave(&tsk->thread.fpu);
+ copy_fxregs_to_kernel(&tsk->thread.fpu);
} else {
fpstate_sanitize_xstate(&tsk->thread.fpu);
if (__copy_to_user(buf_fx, xsave, xstate_size))
@@ -367,23 +367,23 @@ sanitize_restored_xstate(struct task_struct *tsk,
/*
* Restore the extended state if present. Otherwise, restore the FP/SSE state.
*/
-static inline int restore_user_xstate(void __user *buf, u64 xbv, int fx_only)
+static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
{
if (use_xsave()) {
if ((unsigned long)buf % 64 || fx_only) {
u64 init_bv = xfeatures_mask & ~XSTATE_FPSSE;
- xrstor_state(&init_fpstate.xsave, init_bv);
- return fxrstor_user(buf);
+ copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+ return copy_user_to_fxregs(buf);
} else {
u64 init_bv = xfeatures_mask & ~xbv;
if (unlikely(init_bv))
- xrstor_state(&init_fpstate.xsave, init_bv);
- return xrestore_user(buf, xbv);
+ copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+ return copy_user_to_xregs(buf, xbv);
}
} else if (use_fxsr()) {
- return fxrstor_user(buf);
+ return copy_user_to_fxregs(buf);
} else
- return frstor_user(buf);
+ return copy_user_to_fregs(buf);
}

static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
@@ -471,7 +471,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
* state to the registers directly (with exceptions handled).
*/
user_fpu_begin();
- if (restore_user_xstate(buf_fx, xfeatures, fx_only)) {
+ if (copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only)) {
fpu__clear(fpu);
return -1;
}
@@ -667,13 +667,13 @@ static void setup_init_fpu_buf(void)
/*
* Init all the features state with header_bv being 0x0
*/
- xrstor_state_booting(&init_fpstate.xsave, -1);
+ copy_kernel_to_xregs_booting(&init_fpstate.xsave, -1);

/*
* Dump the init state again. This is to identify the init state
* of any feature which is not represented by all zero's.
*/
- xsave_state_booting(&init_fpstate.xsave);
+ copy_xregs_to_kernel_booting(&init_fpstate.xsave);
}

/*
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index ea5b367b63a9..5e20bacee210 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -389,7 +389,7 @@ int mpx_enable_management(struct task_struct *tsk)
* directory into XSAVE/XRSTOR Save Area and enable MPX through
* XRSTOR instruction.
*
- * xsave_state() is expected to be very expensive. Storing the bounds
+ * copy_xregs_to_kernel() is expected to be very expensive. Storing the bounds
* directory here means that we do not have to do xsave in the unmap
* path; we can just use mm->bd_addr instead.
*/
--
2.1.0

2015-05-05 17:59:53

by Ingo Molnar

Subject: [PATCH 194/208] x86/fpu: Factor out fpu/signal.c

fpu/xstate.c has a lot of generic FPU signal frame handling routines;
move them into a separate file: fpu/signal.c.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/signal.h | 2 +
arch/x86/kernel/fpu/Makefile | 2 +-
arch/x86/kernel/fpu/signal.c | 404 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/fpu/xstate.c | 394 +-------------------------------------------------------
4 files changed, 409 insertions(+), 393 deletions(-)

diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
index 0803dc2aba80..7358e9d61f1e 100644
--- a/arch/x86/include/asm/fpu/signal.h
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -28,4 +28,6 @@ unsigned long
fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
unsigned long *buf_fx, unsigned long *size);

+extern void fpu__init_prepare_fx_sw_frame(void);
+
#endif /* _ASM_X86_FPU_SIGNAL_H */
diff --git a/arch/x86/kernel/fpu/Makefile b/arch/x86/kernel/fpu/Makefile
index 6ae59bccdd2f..5c697ded57f2 100644
--- a/arch/x86/kernel/fpu/Makefile
+++ b/arch/x86/kernel/fpu/Makefile
@@ -2,4 +2,4 @@
# Build rules for the FPU support code:
#

-obj-y += init.o bugs.o core.o xstate.o
+obj-y += init.o bugs.o core.o signal.o xstate.o
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
new file mode 100644
index 000000000000..8d0c26ab5123
--- /dev/null
+++ b/arch/x86/kernel/fpu/signal.c
@@ -0,0 +1,404 @@
+/*
+ * FPU signal frame handling routines.
+ */
+
+#include <linux/compat.h>
+#include <linux/cpu.h>
+
+#include <asm/fpu/internal.h>
+#include <asm/fpu/signal.h>
+#include <asm/fpu/regset.h>
+
+#include <asm/sigframe.h>
+
+static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
+
+/*
+ * Check for the presence of extended state information in the
+ * user fpstate pointer in the sigcontext.
+ */
+static inline int check_for_xstate(struct i387_fxsave_struct __user *buf,
+ void __user *fpstate,
+ struct _fpx_sw_bytes *fx_sw)
+{
+ int min_xstate_size = sizeof(struct i387_fxsave_struct) +
+ sizeof(struct xstate_header);
+ unsigned int magic2;
+
+ if (__copy_from_user(fx_sw, &buf->sw_reserved[0], sizeof(*fx_sw)))
+ return -1;
+
+ /* Check for the first magic field and other error scenarios. */
+ if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
+ fx_sw->xstate_size < min_xstate_size ||
+ fx_sw->xstate_size > xstate_size ||
+ fx_sw->xstate_size > fx_sw->extended_size)
+ return -1;
+
+ /*
+ * Check for the presence of second magic word at the end of memory
+ * layout. This detects the case where the user just copied the legacy
+ * fpstate layout with out copying the extended state information
+ * in the memory layout.
+ */
+ if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size))
+ || magic2 != FP_XSTATE_MAGIC2)
+ return -1;
+
+ return 0;
+}
+
+/*
+ * Signal frame handlers.
+ */
+static inline int save_fsave_header(struct task_struct *tsk, void __user *buf)
+{
+ if (use_fxsr()) {
+ struct xsave_struct *xsave = &tsk->thread.fpu.state.xsave;
+ struct user_i387_ia32_struct env;
+ struct _fpstate_ia32 __user *fp = buf;
+
+ convert_from_fxsr(&env, tsk);
+
+ if (__copy_to_user(buf, &env, sizeof(env)) ||
+ __put_user(xsave->i387.swd, &fp->status) ||
+ __put_user(X86_FXSR_MAGIC, &fp->magic))
+ return -1;
+ } else {
+ struct i387_fsave_struct __user *fp = buf;
+ u32 swd;
+ if (__get_user(swd, &fp->swd) || __put_user(swd, &fp->status))
+ return -1;
+ }
+
+ return 0;
+}
+
+static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
+{
+ struct xsave_struct __user *x = buf;
+ struct _fpx_sw_bytes *sw_bytes;
+ u32 xfeatures;
+ int err;
+
+ /* Setup the bytes not touched by the [f]xsave and reserved for SW. */
+ sw_bytes = ia32_frame ? &fx_sw_reserved_ia32 : &fx_sw_reserved;
+ err = __copy_to_user(&x->i387.sw_reserved, sw_bytes, sizeof(*sw_bytes));
+
+ if (!use_xsave())
+ return err;
+
+ err |= __put_user(FP_XSTATE_MAGIC2, (__u32 *)(buf + xstate_size));
+
+ /*
+ * Read the xfeatures which we copied (directly from the cpu or
+ * from the state in task struct) to the user buffers.
+ */
+ err |= __get_user(xfeatures, (__u32 *)&x->header.xfeatures);
+
+ /*
+ * For legacy compatible, we always set FP/SSE bits in the bit
+ * vector while saving the state to the user context. This will
+ * enable us capturing any changes(during sigreturn) to
+ * the FP/SSE bits by the legacy applications which don't touch
+ * xfeatures in the xsave header.
+ *
+ * xsave aware apps can change the xfeatures in the xsave
+ * header as well as change any contents in the memory layout.
+ * xrestore as part of sigreturn will capture all the changes.
+ */
+ xfeatures |= XSTATE_FPSSE;
+
+ err |= __put_user(xfeatures, (__u32 *)&x->header.xfeatures);
+
+ return err;
+}
+
+static inline int copy_fpregs_to_sigframe(struct xsave_struct __user *buf)
+{
+ int err;
+
+ if (use_xsave())
+ err = copy_xregs_to_user(buf);
+ else if (use_fxsr())
+ err = copy_fxregs_to_user((struct i387_fxsave_struct __user *) buf);
+ else
+ err = copy_fregs_to_user((struct i387_fsave_struct __user *) buf);
+
+ if (unlikely(err) && __clear_user(buf, xstate_size))
+ err = -EFAULT;
+ return err;
+}
+
+/*
+ * Save the fpu, extended register state to the user signal frame.
+ *
+ * 'buf_fx' is the 64-byte aligned pointer at which the [f|fx|x]save
+ * state is copied.
+ * 'buf' points to the 'buf_fx' or to the fsave header followed by 'buf_fx'.
+ *
+ * buf == buf_fx for 64-bit frames and 32-bit fsave frame.
+ * buf != buf_fx for 32-bit frames with fxstate.
+ *
+ * If the fpu, extended register state is live, save the state directly
+ * to the user frame pointed by the aligned pointer 'buf_fx'. Otherwise,
+ * copy the thread's fpu state to the user frame starting at 'buf_fx'.
+ *
+ * If this is a 32-bit frame with fxstate, put a fsave header before
+ * the aligned state at 'buf_fx'.
+ *
+ * For [f]xsave state, update the SW reserved fields in the [f]xsave frame
+ * indicating the absence/presence of the extended state to the user.
+ */
+int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
+{
+ struct xsave_struct *xsave = &current->thread.fpu.state.xsave;
+ struct task_struct *tsk = current;
+ int ia32_fxstate = (buf != buf_fx);
+
+ ia32_fxstate &= (config_enabled(CONFIG_X86_32) ||
+ config_enabled(CONFIG_IA32_EMULATION));
+
+ if (!access_ok(VERIFY_WRITE, buf, size))
+ return -EACCES;
+
+ if (!static_cpu_has(X86_FEATURE_FPU))
+ return fpregs_soft_get(current, NULL, 0,
+ sizeof(struct user_i387_ia32_struct), NULL,
+ (struct _fpstate_ia32 __user *) buf) ? -1 : 1;
+
+ if (fpregs_active()) {
+ /* Save the live register state to the user directly. */
+ if (copy_fpregs_to_sigframe(buf_fx))
+ return -1;
+ /* Update the thread's fxstate to save the fsave header. */
+ if (ia32_fxstate)
+ copy_fxregs_to_kernel(&tsk->thread.fpu);
+ } else {
+ fpstate_sanitize_xstate(&tsk->thread.fpu);
+ if (__copy_to_user(buf_fx, xsave, xstate_size))
+ return -1;
+ }
+
+ /* Save the fsave header for the 32-bit frames. */
+ if ((ia32_fxstate || !use_fxsr()) && save_fsave_header(tsk, buf))
+ return -1;
+
+ if (use_fxsr() && save_xstate_epilog(buf_fx, ia32_fxstate))
+ return -1;
+
+ return 0;
+}
+
+static inline void
+sanitize_restored_xstate(struct task_struct *tsk,
+ struct user_i387_ia32_struct *ia32_env,
+ u64 xfeatures, int fx_only)
+{
+ struct xsave_struct *xsave = &tsk->thread.fpu.state.xsave;
+ struct xstate_header *header = &xsave->header;
+
+ if (use_xsave()) {
+ /* These bits must be zero. */
+ memset(header->reserved, 0, 48);
+
+ /*
+ * Init the state that is not present in the memory
+ * layout and not enabled by the OS.
+ */
+ if (fx_only)
+ header->xfeatures = XSTATE_FPSSE;
+ else
+ header->xfeatures &= (xfeatures_mask & xfeatures);
+ }
+
+ if (use_fxsr()) {
+ /*
+ * mscsr reserved bits must be masked to zero for security
+ * reasons.
+ */
+ xsave->i387.mxcsr &= mxcsr_feature_mask;
+
+ convert_to_fxsr(tsk, ia32_env);
+ }
+}
+
+/*
+ * Restore the extended state if present. Otherwise, restore the FP/SSE state.
+ */
+static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
+{
+ if (use_xsave()) {
+ if ((unsigned long)buf % 64 || fx_only) {
+ u64 init_bv = xfeatures_mask & ~XSTATE_FPSSE;
+ copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+ return copy_user_to_fxregs(buf);
+ } else {
+ u64 init_bv = xfeatures_mask & ~xbv;
+ if (unlikely(init_bv))
+ copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+ return copy_user_to_xregs(buf, xbv);
+ }
+ } else if (use_fxsr()) {
+ return copy_user_to_fxregs(buf);
+ } else
+ return copy_user_to_fregs(buf);
+}
+
+static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
+{
+ int ia32_fxstate = (buf != buf_fx);
+ struct task_struct *tsk = current;
+ struct fpu *fpu = &tsk->thread.fpu;
+ int state_size = xstate_size;
+ u64 xfeatures = 0;
+ int fx_only = 0;
+
+ ia32_fxstate &= (config_enabled(CONFIG_X86_32) ||
+ config_enabled(CONFIG_IA32_EMULATION));
+
+ if (!buf) {
+ fpu__clear(fpu);
+ return 0;
+ }
+
+ if (!access_ok(VERIFY_READ, buf, size))
+ return -EACCES;
+
+ fpu__activate_curr(fpu);
+
+ if (!static_cpu_has(X86_FEATURE_FPU))
+ return fpregs_soft_set(current, NULL,
+ 0, sizeof(struct user_i387_ia32_struct),
+ NULL, buf) != 0;
+
+ if (use_xsave()) {
+ struct _fpx_sw_bytes fx_sw_user;
+ if (unlikely(check_for_xstate(buf_fx, buf_fx, &fx_sw_user))) {
+ /*
+ * Couldn't find the extended state information in the
+ * memory layout. Restore just the FP/SSE and init all
+ * the other extended state.
+ */
+ state_size = sizeof(struct i387_fxsave_struct);
+ fx_only = 1;
+ } else {
+ state_size = fx_sw_user.xstate_size;
+ xfeatures = fx_sw_user.xfeatures;
+ }
+ }
+
+ if (ia32_fxstate) {
+ /*
+ * For 32-bit frames with fxstate, copy the user state to the
+ * thread's fpu state, reconstruct fxstate from the fsave
+ * header. Sanitize the copied state etc.
+ */
+ struct fpu *fpu = &tsk->thread.fpu;
+ struct user_i387_ia32_struct env;
+ int err = 0;
+
+	/*
+	 * Drop the current fpu, which clears fpu->fpstate_active. This
+	 * ensures that a context switch during the copy of the new
+	 * state cannot save or restore the half-written intermediate
+	 * state and thereby corrupt the freshly restored state.
+	 * We will be ready to save/restore the state again only
+	 * after fpu->fpstate_active is set.
+	 */
+ fpu__drop(fpu);
+
+ if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) ||
+ __copy_from_user(&env, buf, sizeof(env))) {
+ fpstate_init(&fpu->state);
+ err = -1;
+ } else {
+ sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
+ }
+
+ fpu->fpstate_active = 1;
+ if (use_eager_fpu()) {
+ preempt_disable();
+ fpu__restore();
+ preempt_enable();
+ }
+
+ return err;
+ } else {
+ /*
+ * For 64-bit frames and 32-bit fsave frames, restore the user
+ * state to the registers directly (with exceptions handled).
+ */
+ user_fpu_begin();
+ if (copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only)) {
+ fpu__clear(fpu);
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+static inline int xstate_sigframe_size(void)
+{
+ return use_xsave() ? xstate_size + FP_XSTATE_MAGIC2_SIZE : xstate_size;
+}
+
+/*
+ * Restore FPU state from a sigframe:
+ */
+int fpu__restore_sig(void __user *buf, int ia32_frame)
+{
+ void __user *buf_fx = buf;
+ int size = xstate_sigframe_size();
+
+ if (ia32_frame && use_fxsr()) {
+ buf_fx = buf + sizeof(struct i387_fsave_struct);
+ size += sizeof(struct i387_fsave_struct);
+ }
+
+ return __fpu__restore_sig(buf, buf_fx, size);
+}
+
+unsigned long
+fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
+ unsigned long *buf_fx, unsigned long *size)
+{
+ unsigned long frame_size = xstate_sigframe_size();
+
+ *buf_fx = sp = round_down(sp - frame_size, 64);
+ if (ia32_frame && use_fxsr()) {
+ frame_size += sizeof(struct i387_fsave_struct);
+ sp -= sizeof(struct i387_fsave_struct);
+ }
+
+ *size = frame_size;
+
+ return sp;
+}
+/*
+ * Prepare the SW reserved portion of the fxsave memory layout, indicating
+ * the presence of the extended state information in the memory layout
+ * pointed to by the fpstate pointer in the sigcontext.
+ * This will be saved whenever the FP and extended state context is
+ * saved on the user stack during signal delivery to the user.
+ */
+void fpu__init_prepare_fx_sw_frame(void)
+{
+ int fsave_header_size = sizeof(struct i387_fsave_struct);
+ int size = xstate_size + FP_XSTATE_MAGIC2_SIZE;
+
+ if (config_enabled(CONFIG_X86_32))
+ size += fsave_header_size;
+
+ fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
+ fx_sw_reserved.extended_size = size;
+ fx_sw_reserved.xfeatures = xfeatures_mask;
+ fx_sw_reserved.xstate_size = xstate_size;
+
+ if (config_enabled(CONFIG_IA32_EMULATION)) {
+ fx_sw_reserved_ia32 = fx_sw_reserved;
+ fx_sw_reserved_ia32.extended_size += fsave_header_size;
+ }
+}
+
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 336b3dae59ca..3629e2ef3c94 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -10,7 +10,7 @@
#include <asm/fpu/internal.h>
#include <asm/fpu/signal.h>
#include <asm/fpu/regset.h>
-#include <asm/sigframe.h>
+
#include <asm/tlbflush.h>

static const char *xfeature_names[] =
@@ -31,7 +31,6 @@ static const char *xfeature_names[] =
*/
u64 xfeatures_mask __read_mostly;

-static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
static unsigned int xstate_offsets[XFEATURES_NR_MAX], xstate_sizes[XFEATURES_NR_MAX];
static unsigned int xstate_comp_offsets[sizeof(xfeatures_mask)*8];

@@ -155,395 +154,6 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
}

/*
- * Check for the presence of extended state information in the
- * user fpstate pointer in the sigcontext.
- */
-static inline int check_for_xstate(struct i387_fxsave_struct __user *buf,
- void __user *fpstate,
- struct _fpx_sw_bytes *fx_sw)
-{
- int min_xstate_size = sizeof(struct i387_fxsave_struct) +
- sizeof(struct xstate_header);
- unsigned int magic2;
-
- if (__copy_from_user(fx_sw, &buf->sw_reserved[0], sizeof(*fx_sw)))
- return -1;
-
- /* Check for the first magic field and other error scenarios. */
- if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
- fx_sw->xstate_size < min_xstate_size ||
- fx_sw->xstate_size > xstate_size ||
- fx_sw->xstate_size > fx_sw->extended_size)
- return -1;
-
- /*
- * Check for the presence of second magic word at the end of memory
- * layout. This detects the case where the user just copied the legacy
- * fpstate layout with out copying the extended state information
- * in the memory layout.
- */
- if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size))
- || magic2 != FP_XSTATE_MAGIC2)
- return -1;
-
- return 0;
-}
-
-/*
- * Signal frame handlers.
- */
-static inline int save_fsave_header(struct task_struct *tsk, void __user *buf)
-{
- if (use_fxsr()) {
- struct xsave_struct *xsave = &tsk->thread.fpu.state.xsave;
- struct user_i387_ia32_struct env;
- struct _fpstate_ia32 __user *fp = buf;
-
- convert_from_fxsr(&env, tsk);
-
- if (__copy_to_user(buf, &env, sizeof(env)) ||
- __put_user(xsave->i387.swd, &fp->status) ||
- __put_user(X86_FXSR_MAGIC, &fp->magic))
- return -1;
- } else {
- struct i387_fsave_struct __user *fp = buf;
- u32 swd;
- if (__get_user(swd, &fp->swd) || __put_user(swd, &fp->status))
- return -1;
- }
-
- return 0;
-}
-
-static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
-{
- struct xsave_struct __user *x = buf;
- struct _fpx_sw_bytes *sw_bytes;
- u32 xfeatures;
- int err;
-
- /* Setup the bytes not touched by the [f]xsave and reserved for SW. */
- sw_bytes = ia32_frame ? &fx_sw_reserved_ia32 : &fx_sw_reserved;
- err = __copy_to_user(&x->i387.sw_reserved, sw_bytes, sizeof(*sw_bytes));
-
- if (!use_xsave())
- return err;
-
- err |= __put_user(FP_XSTATE_MAGIC2, (__u32 *)(buf + xstate_size));
-
- /*
- * Read the xfeatures which we copied (directly from the cpu or
- * from the state in task struct) to the user buffers.
- */
- err |= __get_user(xfeatures, (__u32 *)&x->header.xfeatures);
-
- /*
- * For legacy compatible, we always set FP/SSE bits in the bit
- * vector while saving the state to the user context. This will
- * enable us capturing any changes(during sigreturn) to
- * the FP/SSE bits by the legacy applications which don't touch
- * xfeatures in the xsave header.
- *
- * xsave aware apps can change the xfeatures in the xsave
- * header as well as change any contents in the memory layout.
- * xrestore as part of sigreturn will capture all the changes.
- */
- xfeatures |= XSTATE_FPSSE;
-
- err |= __put_user(xfeatures, (__u32 *)&x->header.xfeatures);
-
- return err;
-}
-
-static inline int copy_fpregs_to_sigframe(struct xsave_struct __user *buf)
-{
- int err;
-
- if (use_xsave())
- err = copy_xregs_to_user(buf);
- else if (use_fxsr())
- err = copy_fxregs_to_user((struct i387_fxsave_struct __user *) buf);
- else
- err = copy_fregs_to_user((struct i387_fsave_struct __user *) buf);
-
- if (unlikely(err) && __clear_user(buf, xstate_size))
- err = -EFAULT;
- return err;
-}
-
-/*
- * Save the fpu, extended register state to the user signal frame.
- *
- * 'buf_fx' is the 64-byte aligned pointer at which the [f|fx|x]save
- * state is copied.
- * 'buf' points to the 'buf_fx' or to the fsave header followed by 'buf_fx'.
- *
- * buf == buf_fx for 64-bit frames and 32-bit fsave frame.
- * buf != buf_fx for 32-bit frames with fxstate.
- *
- * If the fpu, extended register state is live, save the state directly
- * to the user frame pointed by the aligned pointer 'buf_fx'. Otherwise,
- * copy the thread's fpu state to the user frame starting at 'buf_fx'.
- *
- * If this is a 32-bit frame with fxstate, put a fsave header before
- * the aligned state at 'buf_fx'.
- *
- * For [f]xsave state, update the SW reserved fields in the [f]xsave frame
- * indicating the absence/presence of the extended state to the user.
- */
-int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
-{
- struct xsave_struct *xsave = &current->thread.fpu.state.xsave;
- struct task_struct *tsk = current;
- int ia32_fxstate = (buf != buf_fx);
-
- ia32_fxstate &= (config_enabled(CONFIG_X86_32) ||
- config_enabled(CONFIG_IA32_EMULATION));
-
- if (!access_ok(VERIFY_WRITE, buf, size))
- return -EACCES;
-
- if (!static_cpu_has(X86_FEATURE_FPU))
- return fpregs_soft_get(current, NULL, 0,
- sizeof(struct user_i387_ia32_struct), NULL,
- (struct _fpstate_ia32 __user *) buf) ? -1 : 1;
-
- if (fpregs_active()) {
- /* Save the live register state to the user directly. */
- if (copy_fpregs_to_sigframe(buf_fx))
- return -1;
- /* Update the thread's fxstate to save the fsave header. */
- if (ia32_fxstate)
- copy_fxregs_to_kernel(&tsk->thread.fpu);
- } else {
- fpstate_sanitize_xstate(&tsk->thread.fpu);
- if (__copy_to_user(buf_fx, xsave, xstate_size))
- return -1;
- }
-
- /* Save the fsave header for the 32-bit frames. */
- if ((ia32_fxstate || !use_fxsr()) && save_fsave_header(tsk, buf))
- return -1;
-
- if (use_fxsr() && save_xstate_epilog(buf_fx, ia32_fxstate))
- return -1;
-
- return 0;
-}
-
-static inline void
-sanitize_restored_xstate(struct task_struct *tsk,
- struct user_i387_ia32_struct *ia32_env,
- u64 xfeatures, int fx_only)
-{
- struct xsave_struct *xsave = &tsk->thread.fpu.state.xsave;
- struct xstate_header *header = &xsave->header;
-
- if (use_xsave()) {
- /* These bits must be zero. */
- memset(header->reserved, 0, 48);
-
- /*
- * Init the state that is not present in the memory
- * layout and not enabled by the OS.
- */
- if (fx_only)
- header->xfeatures = XSTATE_FPSSE;
- else
- header->xfeatures &= (xfeatures_mask & xfeatures);
- }
-
- if (use_fxsr()) {
- /*
- * mscsr reserved bits must be masked to zero for security
- * reasons.
- */
- xsave->i387.mxcsr &= mxcsr_feature_mask;
-
- convert_to_fxsr(tsk, ia32_env);
- }
-}
-
-/*
- * Restore the extended state if present. Otherwise, restore the FP/SSE state.
- */
-static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
-{
- if (use_xsave()) {
- if ((unsigned long)buf % 64 || fx_only) {
- u64 init_bv = xfeatures_mask & ~XSTATE_FPSSE;
- copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- return copy_user_to_fxregs(buf);
- } else {
- u64 init_bv = xfeatures_mask & ~xbv;
- if (unlikely(init_bv))
- copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- return copy_user_to_xregs(buf, xbv);
- }
- } else if (use_fxsr()) {
- return copy_user_to_fxregs(buf);
- } else
- return copy_user_to_fregs(buf);
-}
-
-static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
-{
- int ia32_fxstate = (buf != buf_fx);
- struct task_struct *tsk = current;
- struct fpu *fpu = &tsk->thread.fpu;
- int state_size = xstate_size;
- u64 xfeatures = 0;
- int fx_only = 0;
-
- ia32_fxstate &= (config_enabled(CONFIG_X86_32) ||
- config_enabled(CONFIG_IA32_EMULATION));
-
- if (!buf) {
- fpu__clear(fpu);
- return 0;
- }
-
- if (!access_ok(VERIFY_READ, buf, size))
- return -EACCES;
-
- fpu__activate_curr(fpu);
-
- if (!static_cpu_has(X86_FEATURE_FPU))
- return fpregs_soft_set(current, NULL,
- 0, sizeof(struct user_i387_ia32_struct),
- NULL, buf) != 0;
-
- if (use_xsave()) {
- struct _fpx_sw_bytes fx_sw_user;
- if (unlikely(check_for_xstate(buf_fx, buf_fx, &fx_sw_user))) {
- /*
- * Couldn't find the extended state information in the
- * memory layout. Restore just the FP/SSE and init all
- * the other extended state.
- */
- state_size = sizeof(struct i387_fxsave_struct);
- fx_only = 1;
- } else {
- state_size = fx_sw_user.xstate_size;
- xfeatures = fx_sw_user.xfeatures;
- }
- }
-
- if (ia32_fxstate) {
- /*
- * For 32-bit frames with fxstate, copy the user state to the
- * thread's fpu state, reconstruct fxstate from the fsave
- * header. Sanitize the copied state etc.
- */
- struct fpu *fpu = &tsk->thread.fpu;
- struct user_i387_ia32_struct env;
- int err = 0;
-
- /*
- * Drop the current fpu which clears fpu->fpstate_active. This ensures
- * that any context-switch during the copy of the new state,
- * avoids the intermediate state from getting restored/saved.
- * Thus avoiding the new restored state from getting corrupted.
- * We will be ready to restore/save the state only after
- * fpu->fpstate_active is again set.
- */
- fpu__drop(fpu);
-
- if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) ||
- __copy_from_user(&env, buf, sizeof(env))) {
- fpstate_init(&fpu->state);
- err = -1;
- } else {
- sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
- }
-
- fpu->fpstate_active = 1;
- if (use_eager_fpu()) {
- preempt_disable();
- fpu__restore();
- preempt_enable();
- }
-
- return err;
- } else {
- /*
- * For 64-bit frames and 32-bit fsave frames, restore the user
- * state to the registers directly (with exceptions handled).
- */
- user_fpu_begin();
- if (copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only)) {
- fpu__clear(fpu);
- return -1;
- }
- }
-
- return 0;
-}
-
-static inline int xstate_sigframe_size(void)
-{
- return use_xsave() ? xstate_size + FP_XSTATE_MAGIC2_SIZE : xstate_size;
-}
-
-/*
- * Restore FPU state from a sigframe:
- */
-int fpu__restore_sig(void __user *buf, int ia32_frame)
-{
- void __user *buf_fx = buf;
- int size = xstate_sigframe_size();
-
- if (ia32_frame && use_fxsr()) {
- buf_fx = buf + sizeof(struct i387_fsave_struct);
- size += sizeof(struct i387_fsave_struct);
- }
-
- return __fpu__restore_sig(buf, buf_fx, size);
-}
-
-unsigned long
-fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
- unsigned long *buf_fx, unsigned long *size)
-{
- unsigned long frame_size = xstate_sigframe_size();
-
- *buf_fx = sp = round_down(sp - frame_size, 64);
- if (ia32_frame && use_fxsr()) {
- frame_size += sizeof(struct i387_fsave_struct);
- sp -= sizeof(struct i387_fsave_struct);
- }
-
- *size = frame_size;
-
- return sp;
-}
-/*
- * Prepare the SW reserved portion of the fxsave memory layout, indicating
- * the presence of the extended state information in the memory layout
- * pointed by the fpstate pointer in the sigcontext.
- * This will be saved when ever the FP and extended state context is
- * saved on the user stack during the signal handler delivery to the user.
- */
-static void prepare_fx_sw_frame(void)
-{
- int fsave_header_size = sizeof(struct i387_fsave_struct);
- int size = xstate_size + FP_XSTATE_MAGIC2_SIZE;
-
- if (config_enabled(CONFIG_X86_32))
- size += fsave_header_size;
-
- fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
- fx_sw_reserved.extended_size = size;
- fx_sw_reserved.xfeatures = xfeatures_mask;
- fx_sw_reserved.xstate_size = xstate_size;
-
- if (config_enabled(CONFIG_IA32_EMULATION)) {
- fx_sw_reserved_ia32 = fx_sw_reserved;
- fx_sw_reserved_ia32.extended_size += fsave_header_size;
- }
-}
-
-/*
* Enable the extended processor state save/restore feature.
* Called once per CPU onlining.
*/
@@ -741,7 +351,7 @@ void fpu__init_system_xstate(void)
init_xstate_size();

update_regset_xstate_info(xstate_size, xfeatures_mask);
- prepare_fx_sw_frame();
+ fpu__init_prepare_fx_sw_frame();
setup_init_fpu_buf();

pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is 0x%x bytes, using '%s' format.\n",
--
2.1.0

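The fpu__alloc_mathframe() logic above carves the signal math frame out of the user stack: it rounds the [f|x]save area down to 64 bytes and, for 32-bit fxstate frames, places the legacy fsave header just below it. Here is a standalone sketch of that arithmetic in Python (illustration only, assuming use_xsave() and use_fxsr() are both true; the 112-byte fsave header size is an assumption, standing in for sizeof(struct i387_fsave_struct)):

```python
FP_XSTATE_MAGIC2_SIZE = 4   # trailing magic word after the xsave area
FSAVE_HEADER_SIZE = 112     # assumed sizeof(struct i387_fsave_struct)

def alloc_mathframe(sp, ia32_frame, xstate_size):
    """Mirror of fpu__alloc_mathframe(): returns (new_sp, buf_fx, size)."""
    frame_size = xstate_size + FP_XSTATE_MAGIC2_SIZE
    # round_down(sp - frame_size, 64): the [f|x]save area is 64-byte aligned
    buf_fx = sp = (sp - frame_size) & ~63
    if ia32_frame:
        # 32-bit fxstate frames carry an fsave header below the aligned area
        frame_size += FSAVE_HEADER_SIZE
        sp -= FSAVE_HEADER_SIZE
    return sp, buf_fx, frame_size

sp, buf_fx, size = alloc_mathframe(0x7fff_1234, ia32_frame=True,
                                   xstate_size=832)
assert buf_fx % 64 == 0    # xsave area stays 64-byte aligned
assert buf_fx - sp == 112  # fsave header sits just below it
assert size == 832 + 4 + 112
```

Note how the returned stack pointer drops below buf_fx only in the 32-bit case, which is exactly why buf != buf_fx signals an ia32 fxstate frame in copy_fpstate_to_sigframe() and __fpu__restore_sig().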
2015-05-05 17:59:56

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 195/208] x86/fpu: Factor out the FPU regset code into fpu/regset.c

So much of fpu/core.c is regset code that it obscures the generic FPU
state machine logic. Factor the regset code out into fpu/regset.c, where
it can be read in isolation.

This affects one API: fpu__activate_stopped() has to be made available
from the core to fpu/regset.c.

No change in functionality.
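Among the code moved to fpu/regset.c is twd_i387_to_fxsr(), which compresses the 16-bit x87 tag word (2 bits per register, 0b11 meaning empty) into the FXSR 8-bit validity map (1 bit per register). The bit trick mirrors the helper being moved and can be sketched outside the kernel like this (illustration only):

```python
def twd_i387_to_fxsr(twd):
    """Compress the 16-bit x87 tag word (2 bits/reg, 0b11 = empty)
    into the FXSR 8-bit tag map (1 bit/reg, 1 = valid)."""
    tmp = ~twd & 0xffff
    # Transform each pair of bits into 01 (valid) or 00 (empty)
    tmp = (tmp | (tmp >> 1)) & 0x5555   # 0V0V0V0V0V0V0V0V
    # and funnel the valid bits down into the low byte
    tmp = (tmp | (tmp >> 1)) & 0x3333   # 00VV00VV00VV00VV
    tmp = (tmp | (tmp >> 2)) & 0x0f0f   # 0000VVVV0000VVVV
    tmp = (tmp | (tmp >> 4)) & 0x00ff   # 00000000VVVVVVVV
    return tmp

assert twd_i387_to_fxsr(0xffff) == 0x00   # all registers empty
assert twd_i387_to_fxsr(0x0000) == 0xff   # all registers valid
assert twd_i387_to_fxsr(0x0003) == 0xfe   # only register 0 empty
```

The inverse direction (twd_fxsr_to_i387(), also moved below) cannot be a pure bit trick, because reconstructing the 2-bit tags requires inspecting each register's exponent and significand.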

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 4 +-
arch/x86/kernel/fpu/Makefile | 2 +-
arch/x86/kernel/fpu/core.c | 352 +------------------------------------------------------
arch/x86/kernel/fpu/regset.c | 356 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 360 insertions(+), 354 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index f23ea10d3a1f..db6c24ba6d3d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -29,8 +29,6 @@ extern void fpu__init_system_xstate(void);
extern void fpu__init_cpu_xstate(void);
extern void fpu__init_system(struct cpuinfo_x86 *c);

-extern void fpu__activate_curr(struct fpu *fpu);
-
extern void fpstate_init(union thread_xstate *state);
#ifdef CONFIG_MATH_EMULATION
extern void fpstate_init_soft(struct i387_soft_struct *soft);
@@ -49,6 +47,8 @@ extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
/*
* High level FPU state handling functions:
*/
+extern void fpu__activate_curr(struct fpu *fpu);
+extern void fpu__activate_stopped(struct fpu *fpu);
extern void fpu__save(struct fpu *fpu);
extern void fpu__restore(void);
extern int fpu__restore_sig(void __user *buf, int ia32_frame);
diff --git a/arch/x86/kernel/fpu/Makefile b/arch/x86/kernel/fpu/Makefile
index 5c697ded57f2..68279efb811a 100644
--- a/arch/x86/kernel/fpu/Makefile
+++ b/arch/x86/kernel/fpu/Makefile
@@ -2,4 +2,4 @@
# Build rules for the FPU support code:
#

-obj-y += init.o bugs.o core.o signal.o xstate.o
+obj-y += init.o bugs.o core.o regset.o signal.o xstate.o
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 02eaec4722ba..f3443b9fb7d8 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -320,7 +320,7 @@ EXPORT_SYMBOL_GPL(fpu__activate_curr);
* the read-only case, it's not strictly necessary for
* read-only access to the context.
*/
-static void fpu__activate_stopped(struct fpu *child_fpu)
+void fpu__activate_stopped(struct fpu *child_fpu)
{
WARN_ON_ONCE(child_fpu == &current->thread.fpu);

@@ -426,356 +426,6 @@ void fpu__clear(struct fpu *fpu)
}

/*
- * The xstateregs_active() routine is the same as the regset_fpregs_active() routine,
- * as the "regset->n" for the xstate regset will be updated based on the feature
- * capabilites supported by the xsave.
- */
-int regset_fpregs_active(struct task_struct *target, const struct user_regset *regset)
-{
- struct fpu *target_fpu = &target->thread.fpu;
-
- return target_fpu->fpstate_active ? regset->n : 0;
-}
-
-int regset_xregset_fpregs_active(struct task_struct *target, const struct user_regset *regset)
-{
- struct fpu *target_fpu = &target->thread.fpu;
-
- return (cpu_has_fxsr && target_fpu->fpstate_active) ? regset->n : 0;
-}
-
-int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
- unsigned int pos, unsigned int count,
- void *kbuf, void __user *ubuf)
-{
- struct fpu *fpu = &target->thread.fpu;
-
- if (!cpu_has_fxsr)
- return -ENODEV;
-
- fpu__activate_stopped(fpu);
- fpstate_sanitize_xstate(fpu);
-
- return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
- &fpu->state.fxsave, 0, -1);
-}
-
-int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
- unsigned int pos, unsigned int count,
- const void *kbuf, const void __user *ubuf)
-{
- struct fpu *fpu = &target->thread.fpu;
- int ret;
-
- if (!cpu_has_fxsr)
- return -ENODEV;
-
- fpu__activate_stopped(fpu);
- fpstate_sanitize_xstate(fpu);
-
- ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
- &fpu->state.fxsave, 0, -1);
-
- /*
- * mxcsr reserved bits must be masked to zero for security reasons.
- */
- fpu->state.fxsave.mxcsr &= mxcsr_feature_mask;
-
- /*
- * update the header bits in the xsave header, indicating the
- * presence of FP and SSE state.
- */
- if (cpu_has_xsave)
- fpu->state.xsave.header.xfeatures |= XSTATE_FPSSE;
-
- return ret;
-}
-
-int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
- unsigned int pos, unsigned int count,
- void *kbuf, void __user *ubuf)
-{
- struct fpu *fpu = &target->thread.fpu;
- struct xsave_struct *xsave;
- int ret;
-
- if (!cpu_has_xsave)
- return -ENODEV;
-
- fpu__activate_stopped(fpu);
-
- xsave = &fpu->state.xsave;
-
- /*
- * Copy the 48bytes defined by the software first into the xstate
- * memory layout in the thread struct, so that we can copy the entire
- * xstateregs to the user using one user_regset_copyout().
- */
- memcpy(&xsave->i387.sw_reserved,
- xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes));
- /*
- * Copy the xstate memory layout.
- */
- ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
- return ret;
-}
-
-int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
- unsigned int pos, unsigned int count,
- const void *kbuf, const void __user *ubuf)
-{
- struct fpu *fpu = &target->thread.fpu;
- struct xsave_struct *xsave;
- int ret;
-
- if (!cpu_has_xsave)
- return -ENODEV;
-
- fpu__activate_stopped(fpu);
-
- xsave = &fpu->state.xsave;
-
- ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
- /*
- * mxcsr reserved bits must be masked to zero for security reasons.
- */
- xsave->i387.mxcsr &= mxcsr_feature_mask;
- xsave->header.xfeatures &= xfeatures_mask;
- /*
- * These bits must be zero.
- */
- memset(&xsave->header.reserved, 0, 48);
-
- return ret;
-}
-
-#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
-
-/*
- * FPU tag word conversions.
- */
-
-static inline unsigned short twd_i387_to_fxsr(unsigned short twd)
-{
- unsigned int tmp; /* to avoid 16 bit prefixes in the code */
-
- /* Transform each pair of bits into 01 (valid) or 00 (empty) */
- tmp = ~twd;
- tmp = (tmp | (tmp>>1)) & 0x5555; /* 0V0V0V0V0V0V0V0V */
- /* and move the valid bits to the lower byte. */
- tmp = (tmp | (tmp >> 1)) & 0x3333; /* 00VV00VV00VV00VV */
- tmp = (tmp | (tmp >> 2)) & 0x0f0f; /* 0000VVVV0000VVVV */
- tmp = (tmp | (tmp >> 4)) & 0x00ff; /* 00000000VVVVVVVV */
-
- return tmp;
-}
-
-#define FPREG_ADDR(f, n) ((void *)&(f)->st_space + (n) * 16)
-#define FP_EXP_TAG_VALID 0
-#define FP_EXP_TAG_ZERO 1
-#define FP_EXP_TAG_SPECIAL 2
-#define FP_EXP_TAG_EMPTY 3
-
-static inline u32 twd_fxsr_to_i387(struct i387_fxsave_struct *fxsave)
-{
- struct _fpxreg *st;
- u32 tos = (fxsave->swd >> 11) & 7;
- u32 twd = (unsigned long) fxsave->twd;
- u32 tag;
- u32 ret = 0xffff0000u;
- int i;
-
- for (i = 0; i < 8; i++, twd >>= 1) {
- if (twd & 0x1) {
- st = FPREG_ADDR(fxsave, (i - tos) & 7);
-
- switch (st->exponent & 0x7fff) {
- case 0x7fff:
- tag = FP_EXP_TAG_SPECIAL;
- break;
- case 0x0000:
- if (!st->significand[0] &&
- !st->significand[1] &&
- !st->significand[2] &&
- !st->significand[3])
- tag = FP_EXP_TAG_ZERO;
- else
- tag = FP_EXP_TAG_SPECIAL;
- break;
- default:
- if (st->significand[3] & 0x8000)
- tag = FP_EXP_TAG_VALID;
- else
- tag = FP_EXP_TAG_SPECIAL;
- break;
- }
- } else {
- tag = FP_EXP_TAG_EMPTY;
- }
- ret |= tag << (2 * i);
- }
- return ret;
-}
-
-/*
- * FXSR floating point environment conversions.
- */
-
-void
-convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
-{
- struct i387_fxsave_struct *fxsave = &tsk->thread.fpu.state.fxsave;
- struct _fpreg *to = (struct _fpreg *) &env->st_space[0];
- struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0];
- int i;
-
- env->cwd = fxsave->cwd | 0xffff0000u;
- env->swd = fxsave->swd | 0xffff0000u;
- env->twd = twd_fxsr_to_i387(fxsave);
-
-#ifdef CONFIG_X86_64
- env->fip = fxsave->rip;
- env->foo = fxsave->rdp;
- /*
- * should be actually ds/cs at fpu exception time, but
- * that information is not available in 64bit mode.
- */
- env->fcs = task_pt_regs(tsk)->cs;
- if (tsk == current) {
- savesegment(ds, env->fos);
- } else {
- env->fos = tsk->thread.ds;
- }
- env->fos |= 0xffff0000;
-#else
- env->fip = fxsave->fip;
- env->fcs = (u16) fxsave->fcs | ((u32) fxsave->fop << 16);
- env->foo = fxsave->foo;
- env->fos = fxsave->fos;
-#endif
-
- for (i = 0; i < 8; ++i)
- memcpy(&to[i], &from[i], sizeof(to[0]));
-}
-
-void convert_to_fxsr(struct task_struct *tsk,
- const struct user_i387_ia32_struct *env)
-
-{
- struct i387_fxsave_struct *fxsave = &tsk->thread.fpu.state.fxsave;
- struct _fpreg *from = (struct _fpreg *) &env->st_space[0];
- struct _fpxreg *to = (struct _fpxreg *) &fxsave->st_space[0];
- int i;
-
- fxsave->cwd = env->cwd;
- fxsave->swd = env->swd;
- fxsave->twd = twd_i387_to_fxsr(env->twd);
- fxsave->fop = (u16) ((u32) env->fcs >> 16);
-#ifdef CONFIG_X86_64
- fxsave->rip = env->fip;
- fxsave->rdp = env->foo;
- /* cs and ds ignored */
-#else
- fxsave->fip = env->fip;
- fxsave->fcs = (env->fcs & 0xffff);
- fxsave->foo = env->foo;
- fxsave->fos = env->fos;
-#endif
-
- for (i = 0; i < 8; ++i)
- memcpy(&to[i], &from[i], sizeof(from[0]));
-}
-
-int fpregs_get(struct task_struct *target, const struct user_regset *regset,
- unsigned int pos, unsigned int count,
- void *kbuf, void __user *ubuf)
-{
- struct fpu *fpu = &target->thread.fpu;
- struct user_i387_ia32_struct env;
-
- fpu__activate_stopped(fpu);
-
- if (!static_cpu_has(X86_FEATURE_FPU))
- return fpregs_soft_get(target, regset, pos, count, kbuf, ubuf);
-
- if (!cpu_has_fxsr)
- return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
- &fpu->state.fsave, 0,
- -1);
-
- fpstate_sanitize_xstate(fpu);
-
- if (kbuf && pos == 0 && count == sizeof(env)) {
- convert_from_fxsr(kbuf, target);
- return 0;
- }
-
- convert_from_fxsr(&env, target);
-
- return user_regset_copyout(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
-}
-
-int fpregs_set(struct task_struct *target, const struct user_regset *regset,
- unsigned int pos, unsigned int count,
- const void *kbuf, const void __user *ubuf)
-{
- struct fpu *fpu = &target->thread.fpu;
- struct user_i387_ia32_struct env;
- int ret;
-
- fpu__activate_stopped(fpu);
- fpstate_sanitize_xstate(fpu);
-
- if (!static_cpu_has(X86_FEATURE_FPU))
- return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
-
- if (!cpu_has_fxsr)
- return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
- &fpu->state.fsave, 0,
- -1);
-
- if (pos > 0 || count < sizeof(env))
- convert_from_fxsr(&env, target);
-
- ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
- if (!ret)
- convert_to_fxsr(target, &env);
-
- /*
- * update the header bit in the xsave header, indicating the
- * presence of FP.
- */
- if (cpu_has_xsave)
- fpu->state.xsave.header.xfeatures |= XSTATE_FP;
- return ret;
-}
-
-/*
- * FPU state for core dumps.
- * This is only used for a.out dumps now.
- * It is declared generically using elf_fpregset_t (which is
- * struct user_i387_struct) but is in fact only used for 32-bit
- * dumps, so on 64-bit it is really struct user_i387_ia32_struct.
- */
-int dump_fpu(struct pt_regs *regs, struct user_i387_struct *ufpu)
-{
- struct task_struct *tsk = current;
- struct fpu *fpu = &tsk->thread.fpu;
- int fpvalid;
-
- fpvalid = fpu->fpstate_active;
- if (fpvalid)
- fpvalid = !fpregs_get(tsk, NULL,
- 0, sizeof(struct user_i387_ia32_struct),
- ufpu, NULL);
-
- return fpvalid;
-}
-EXPORT_SYMBOL(dump_fpu);
-
-#endif /* CONFIG_X86_32 || CONFIG_IA32_EMULATION */
-
-/*
* x87 math exception handling:
*/

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
new file mode 100644
index 000000000000..1f58a1c2a941
--- /dev/null
+++ b/arch/x86/kernel/fpu/regset.c
@@ -0,0 +1,356 @@
+/*
+ * FPU registers' regset abstraction, for ptrace, core dumps, etc.
+ */
+#include <asm/fpu/internal.h>
+#include <asm/fpu/signal.h>
+#include <asm/fpu/regset.h>
+
+/*
+ * The xstateregs_active() routine is the same as the regset_fpregs_active()
+ * routine, as the "regset->n" for the xstate regset will be updated based
+ * on the feature capabilities supported by xsave.
+ */
+int regset_fpregs_active(struct task_struct *target, const struct user_regset *regset)
+{
+ struct fpu *target_fpu = &target->thread.fpu;
+
+ return target_fpu->fpstate_active ? regset->n : 0;
+}
+
+int regset_xregset_fpregs_active(struct task_struct *target, const struct user_regset *regset)
+{
+ struct fpu *target_fpu = &target->thread.fpu;
+
+ return (cpu_has_fxsr && target_fpu->fpstate_active) ? regset->n : 0;
+}
+
+int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
+{
+ struct fpu *fpu = &target->thread.fpu;
+
+ if (!cpu_has_fxsr)
+ return -ENODEV;
+
+ fpu__activate_stopped(fpu);
+ fpstate_sanitize_xstate(fpu);
+
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &fpu->state.fxsave, 0, -1);
+}
+
+int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ struct fpu *fpu = &target->thread.fpu;
+ int ret;
+
+ if (!cpu_has_fxsr)
+ return -ENODEV;
+
+ fpu__activate_stopped(fpu);
+ fpstate_sanitize_xstate(fpu);
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &fpu->state.fxsave, 0, -1);
+
+ /*
+ * mxcsr reserved bits must be masked to zero for security reasons.
+ */
+ fpu->state.fxsave.mxcsr &= mxcsr_feature_mask;
+
+ /*
+ * update the header bits in the xsave header, indicating the
+ * presence of FP and SSE state.
+ */
+ if (cpu_has_xsave)
+ fpu->state.xsave.header.xfeatures |= XSTATE_FPSSE;
+
+ return ret;
+}
+
+int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
+{
+ struct fpu *fpu = &target->thread.fpu;
+ struct xsave_struct *xsave;
+ int ret;
+
+ if (!cpu_has_xsave)
+ return -ENODEV;
+
+ fpu__activate_stopped(fpu);
+
+ xsave = &fpu->state.xsave;
+
+	/*
+	 * First copy the 48 software-reserved bytes into the xstate
+	 * memory layout in the thread struct, so that we can copy the
+	 * entire xstateregs to the user using one user_regset_copyout().
+	 */
+ memcpy(&xsave->i387.sw_reserved,
+ xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes));
+ /*
+ * Copy the xstate memory layout.
+ */
+ ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
+ return ret;
+}
+
+int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ struct fpu *fpu = &target->thread.fpu;
+ struct xsave_struct *xsave;
+ int ret;
+
+ if (!cpu_has_xsave)
+ return -ENODEV;
+
+ fpu__activate_stopped(fpu);
+
+ xsave = &fpu->state.xsave;
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
+ /*
+ * mxcsr reserved bits must be masked to zero for security reasons.
+ */
+ xsave->i387.mxcsr &= mxcsr_feature_mask;
+ xsave->header.xfeatures &= xfeatures_mask;
+ /*
+ * These bits must be zero.
+ */
+ memset(&xsave->header.reserved, 0, 48);
+
+ return ret;
+}
+
+#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
+
+/*
+ * FPU tag word conversions.
+ */
+
+static inline unsigned short twd_i387_to_fxsr(unsigned short twd)
+{
+ unsigned int tmp; /* to avoid 16 bit prefixes in the code */
+
+ /* Transform each pair of bits into 01 (valid) or 00 (empty) */
+ tmp = ~twd;
+ tmp = (tmp | (tmp>>1)) & 0x5555; /* 0V0V0V0V0V0V0V0V */
+ /* and move the valid bits to the lower byte. */
+ tmp = (tmp | (tmp >> 1)) & 0x3333; /* 00VV00VV00VV00VV */
+ tmp = (tmp | (tmp >> 2)) & 0x0f0f; /* 0000VVVV0000VVVV */
+ tmp = (tmp | (tmp >> 4)) & 0x00ff; /* 00000000VVVVVVVV */
+
+ return tmp;
+}
+
+#define FPREG_ADDR(f, n) ((void *)&(f)->st_space + (n) * 16)
+#define FP_EXP_TAG_VALID 0
+#define FP_EXP_TAG_ZERO 1
+#define FP_EXP_TAG_SPECIAL 2
+#define FP_EXP_TAG_EMPTY 3
+
+static inline u32 twd_fxsr_to_i387(struct i387_fxsave_struct *fxsave)
+{
+ struct _fpxreg *st;
+ u32 tos = (fxsave->swd >> 11) & 7;
+ u32 twd = (unsigned long) fxsave->twd;
+ u32 tag;
+ u32 ret = 0xffff0000u;
+ int i;
+
+ for (i = 0; i < 8; i++, twd >>= 1) {
+ if (twd & 0x1) {
+ st = FPREG_ADDR(fxsave, (i - tos) & 7);
+
+ switch (st->exponent & 0x7fff) {
+ case 0x7fff:
+ tag = FP_EXP_TAG_SPECIAL;
+ break;
+ case 0x0000:
+ if (!st->significand[0] &&
+ !st->significand[1] &&
+ !st->significand[2] &&
+ !st->significand[3])
+ tag = FP_EXP_TAG_ZERO;
+ else
+ tag = FP_EXP_TAG_SPECIAL;
+ break;
+ default:
+ if (st->significand[3] & 0x8000)
+ tag = FP_EXP_TAG_VALID;
+ else
+ tag = FP_EXP_TAG_SPECIAL;
+ break;
+ }
+ } else {
+ tag = FP_EXP_TAG_EMPTY;
+ }
+ ret |= tag << (2 * i);
+ }
+ return ret;
+}
+
+/*
+ * FXSR floating point environment conversions.
+ */
+
+void
+convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
+{
+ struct i387_fxsave_struct *fxsave = &tsk->thread.fpu.state.fxsave;
+ struct _fpreg *to = (struct _fpreg *) &env->st_space[0];
+ struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0];
+ int i;
+
+ env->cwd = fxsave->cwd | 0xffff0000u;
+ env->swd = fxsave->swd | 0xffff0000u;
+ env->twd = twd_fxsr_to_i387(fxsave);
+
+#ifdef CONFIG_X86_64
+ env->fip = fxsave->rip;
+ env->foo = fxsave->rdp;
+ /*
+ * These should actually be ds/cs at FPU exception time, but
+ * that information is not available in 64-bit mode.
+ */
+ env->fcs = task_pt_regs(tsk)->cs;
+ if (tsk == current) {
+ savesegment(ds, env->fos);
+ } else {
+ env->fos = tsk->thread.ds;
+ }
+ env->fos |= 0xffff0000;
+#else
+ env->fip = fxsave->fip;
+ env->fcs = (u16) fxsave->fcs | ((u32) fxsave->fop << 16);
+ env->foo = fxsave->foo;
+ env->fos = fxsave->fos;
+#endif
+
+ for (i = 0; i < 8; ++i)
+ memcpy(&to[i], &from[i], sizeof(to[0]));
+}
+
+void convert_to_fxsr(struct task_struct *tsk,
+ const struct user_i387_ia32_struct *env)
+
+{
+ struct i387_fxsave_struct *fxsave = &tsk->thread.fpu.state.fxsave;
+ struct _fpreg *from = (struct _fpreg *) &env->st_space[0];
+ struct _fpxreg *to = (struct _fpxreg *) &fxsave->st_space[0];
+ int i;
+
+ fxsave->cwd = env->cwd;
+ fxsave->swd = env->swd;
+ fxsave->twd = twd_i387_to_fxsr(env->twd);
+ fxsave->fop = (u16) ((u32) env->fcs >> 16);
+#ifdef CONFIG_X86_64
+ fxsave->rip = env->fip;
+ fxsave->rdp = env->foo;
+ /* cs and ds ignored */
+#else
+ fxsave->fip = env->fip;
+ fxsave->fcs = (env->fcs & 0xffff);
+ fxsave->foo = env->foo;
+ fxsave->fos = env->fos;
+#endif
+
+ for (i = 0; i < 8; ++i)
+ memcpy(&to[i], &from[i], sizeof(from[0]));
+}
+
+int fpregs_get(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
+{
+ struct fpu *fpu = &target->thread.fpu;
+ struct user_i387_ia32_struct env;
+
+ fpu__activate_stopped(fpu);
+
+ if (!static_cpu_has(X86_FEATURE_FPU))
+ return fpregs_soft_get(target, regset, pos, count, kbuf, ubuf);
+
+ if (!cpu_has_fxsr)
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &fpu->state.fsave, 0,
+ -1);
+
+ fpstate_sanitize_xstate(fpu);
+
+ if (kbuf && pos == 0 && count == sizeof(env)) {
+ convert_from_fxsr(kbuf, target);
+ return 0;
+ }
+
+ convert_from_fxsr(&env, target);
+
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
+}
+
+int fpregs_set(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ struct fpu *fpu = &target->thread.fpu;
+ struct user_i387_ia32_struct env;
+ int ret;
+
+ fpu__activate_stopped(fpu);
+ fpstate_sanitize_xstate(fpu);
+
+ if (!static_cpu_has(X86_FEATURE_FPU))
+ return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
+
+ if (!cpu_has_fxsr)
+ return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &fpu->state.fsave, 0,
+ -1);
+
+ if (pos > 0 || count < sizeof(env))
+ convert_from_fxsr(&env, target);
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
+ if (!ret)
+ convert_to_fxsr(target, &env);
+
+ /*
+ * Update the FP feature bit in the xsave header, indicating the
+ * presence of FP state.
+ */
+ if (cpu_has_xsave)
+ fpu->state.xsave.header.xfeatures |= XSTATE_FP;
+ return ret;
+}
+
+/*
+ * FPU state for core dumps.
+ * This is only used for a.out dumps now.
+ * It is declared generically using elf_fpregset_t (which is
+ * struct user_i387_struct) but is in fact only used for 32-bit
+ * dumps, so on 64-bit it is really struct user_i387_ia32_struct.
+ */
+int dump_fpu(struct pt_regs *regs, struct user_i387_struct *ufpu)
+{
+ struct task_struct *tsk = current;
+ struct fpu *fpu = &tsk->thread.fpu;
+ int fpvalid;
+
+ fpvalid = fpu->fpstate_active;
+ if (fpvalid)
+ fpvalid = !fpregs_get(tsk, NULL,
+ 0, sizeof(struct user_i387_ia32_struct),
+ ufpu, NULL);
+
+ return fpvalid;
+}
+EXPORT_SYMBOL(dump_fpu);
+
+#endif /* CONFIG_X86_32 || CONFIG_IA32_EMULATION */
--
2.1.0

2015-05-05 18:04:21

by Ingo Molnar

Subject: [PATCH 196/208] x86/fpu: Harmonize FPU register state types

Use these consistent names:

struct fregs_state # was: i387_fsave_struct
struct fxregs_state # was: i387_fxsave_struct
struct swregs_state # was: i387_soft_struct
struct xregs_state # was: xsave_struct
union fpregs_state # was: thread_xstate

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 22 +++++++++++-----------
arch/x86/include/asm/fpu/types.h | 24 ++++++++++++------------
arch/x86/include/asm/fpu/xstate.h | 16 ++++++++--------
arch/x86/include/asm/mpx.h | 8 ++++----
arch/x86/kernel/fpu/core.c | 6 +++---
arch/x86/kernel/fpu/init.c | 8 ++++----
arch/x86/kernel/fpu/regset.c | 10 +++++-----
arch/x86/kernel/fpu/signal.c | 32 ++++++++++++++++----------------
arch/x86/kernel/fpu/xstate.c | 6 +++---
arch/x86/kernel/traps.c | 2 +-
arch/x86/kvm/x86.c | 12 ++++++------
arch/x86/math-emu/fpu_aux.c | 2 +-
arch/x86/math-emu/fpu_entry.c | 10 +++++-----
arch/x86/mm/mpx.c | 6 +++---
14 files changed, 82 insertions(+), 82 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index db6c24ba6d3d..7fdc90b9dd86 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -22,20 +22,20 @@

extern unsigned int mxcsr_feature_mask;

-extern union thread_xstate init_fpstate;
+extern union fpregs_state init_fpstate;

extern void fpu__init_cpu(void);
extern void fpu__init_system_xstate(void);
extern void fpu__init_cpu_xstate(void);
extern void fpu__init_system(struct cpuinfo_x86 *c);

-extern void fpstate_init(union thread_xstate *state);
+extern void fpstate_init(union fpregs_state *state);
#ifdef CONFIG_MATH_EMULATION
-extern void fpstate_init_soft(struct i387_soft_struct *soft);
+extern void fpstate_init_soft(struct swregs_state *soft);
#else
-static inline void fpstate_init_soft(struct i387_soft_struct *soft) {}
+static inline void fpstate_init_soft(struct swregs_state *soft) {}
#endif
-static inline void fpstate_init_fxstate(struct i387_fxsave_struct *fx)
+static inline void fpstate_init_fxstate(struct fxregs_state *fx)
{
fx->cwd = 0x37f;
fx->mxcsr = MXCSR_DEFAULT;
@@ -133,12 +133,12 @@ extern void fpstate_sanitize_xstate(struct fpu *fpu);
err; \
})

-static inline int copy_fregs_to_user(struct i387_fsave_struct __user *fx)
+static inline int copy_fregs_to_user(struct fregs_state __user *fx)
{
return user_insn(fnsave %[fx]; fwait, [fx] "=m" (*fx), "m" (*fx));
}

-static inline int copy_fxregs_to_user(struct i387_fxsave_struct __user *fx)
+static inline int copy_fxregs_to_user(struct fxregs_state __user *fx)
{
if (config_enabled(CONFIG_X86_32))
return user_insn(fxsave %[fx], [fx] "=m" (*fx), "m" (*fx));
@@ -149,7 +149,7 @@ static inline int copy_fxregs_to_user(struct i387_fxsave_struct __user *fx)
return user_insn(rex64/fxsave (%[fx]), "=m" (*fx), [fx] "R" (fx));
}

-static inline int copy_kernel_to_fxregs(struct i387_fxsave_struct *fx)
+static inline int copy_kernel_to_fxregs(struct fxregs_state *fx)
{
if (config_enabled(CONFIG_X86_32))
return check_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -161,7 +161,7 @@ static inline int copy_kernel_to_fxregs(struct i387_fxsave_struct *fx)
"m" (*fx));
}

-static inline int copy_user_to_fxregs(struct i387_fxsave_struct __user *fx)
+static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)
{
if (config_enabled(CONFIG_X86_32))
return user_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -173,12 +173,12 @@ static inline int copy_user_to_fxregs(struct i387_fxsave_struct __user *fx)
"m" (*fx));
}

-static inline int copy_kernel_to_fregs(struct i387_fsave_struct *fx)
+static inline int copy_kernel_to_fregs(struct fregs_state *fx)
{
return check_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
}

-static inline int copy_user_to_fregs(struct i387_fsave_struct __user *fx)
+static inline int copy_user_to_fregs(struct fregs_state __user *fx)
{
return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
}
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 006ec2975f6f..fe2ce3276a38 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -4,7 +4,7 @@
#ifndef _ASM_X86_FPU_H
#define _ASM_X86_FPU_H

-struct i387_fsave_struct {
+struct fregs_state {
u32 cwd; /* FPU Control Word */
u32 swd; /* FPU Status Word */
u32 twd; /* FPU Tag Word */
@@ -20,7 +20,7 @@ struct i387_fsave_struct {
u32 status;
};

-struct i387_fxsave_struct {
+struct fxregs_state {
u16 cwd; /* Control Word */
u16 swd; /* Status Word */
u16 twd; /* Tag Word */
@@ -58,7 +58,7 @@ struct i387_fxsave_struct {
/*
* Software based FPU emulation state:
*/
-struct i387_soft_struct {
+struct swregs_state {
u32 cwd;
u32 swd;
u32 twd;
@@ -109,7 +109,7 @@ enum xfeature_bit {
/*
* There are 16x 256-bit AVX registers named YMM0-YMM15.
* The low 128 bits are aliased to the 16 SSE registers (XMM0-XMM15)
- * and are stored in 'struct i387_fxsave_struct::xmm_space[]'.
+ * and are stored in 'struct fxregs_state::xmm_space[]'.
*
* The high 128 bits are stored here:
* 16x 128 bits == 256 bytes.
@@ -140,8 +140,8 @@ struct xstate_header {
u64 reserved[6];
} __attribute__((packed));

-struct xsave_struct {
- struct i387_fxsave_struct i387;
+struct xregs_state {
+ struct fxregs_state i387;
struct xstate_header header;
struct ymmh_struct ymmh;
struct lwp_struct lwp;
@@ -150,11 +150,11 @@ struct xsave_struct {
/* New processor state extensions will go here. */
} __attribute__ ((packed, aligned (64)));

-union thread_xstate {
- struct i387_fsave_struct fsave;
- struct i387_fxsave_struct fxsave;
- struct i387_soft_struct soft;
- struct xsave_struct xsave;
+union fpregs_state {
+ struct fregs_state fsave;
+ struct fxregs_state fxsave;
+ struct swregs_state soft;
+ struct xregs_state xsave;
};

struct fpu {
@@ -171,7 +171,7 @@ struct fpu {
unsigned int last_cpu;

unsigned int fpregs_active;
- union thread_xstate state;
+ union fpregs_state state;
/*
* This counter contains the number of consecutive context switches
* during which the FPU stays used. If this is over a threshold, the
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 7f59480697a3..a6181b9ebf42 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -58,7 +58,7 @@ extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
* This function is called only during boot time when x86 caps are not set
* up and alternatives cannot be used yet.
*/
-static inline int copy_xregs_to_kernel_booting(struct xsave_struct *fx)
+static inline int copy_xregs_to_kernel_booting(struct xregs_state *fx)
{
u64 mask = -1;
u32 lmask = mask;
@@ -86,7 +86,7 @@ static inline int copy_xregs_to_kernel_booting(struct xsave_struct *fx)
* This function is called only during boot time when x86 caps are not set
* up and alternatives cannot be used yet.
*/
-static inline int copy_kernel_to_xregs_booting(struct xsave_struct *fx, u64 mask)
+static inline int copy_kernel_to_xregs_booting(struct xregs_state *fx, u64 mask)
{
u32 lmask = mask;
u32 hmask = mask >> 32;
@@ -112,7 +112,7 @@ static inline int copy_kernel_to_xregs_booting(struct xsave_struct *fx, u64 mask
/*
* Save processor xstate to xsave area.
*/
-static inline int copy_xregs_to_kernel(struct xsave_struct *fx)
+static inline int copy_xregs_to_kernel(struct xregs_state *fx)
{
u64 mask = -1;
u32 lmask = mask;
@@ -151,7 +151,7 @@ static inline int copy_xregs_to_kernel(struct xsave_struct *fx)
/*
* Restore processor xstate from xsave area.
*/
-static inline int copy_kernel_to_xregs(struct xsave_struct *fx, u64 mask)
+static inline int copy_kernel_to_xregs(struct xregs_state *fx, u64 mask)
{
int err = 0;
u32 lmask = mask;
@@ -186,7 +186,7 @@ static inline int copy_kernel_to_xregs(struct xsave_struct *fx, u64 mask)
* backward compatibility for old applications which don't understand
* the compacted format of the xsave area.
*/
-static inline int copy_xregs_to_user(struct xsave_struct __user *buf)
+static inline int copy_xregs_to_user(struct xregs_state __user *buf)
{
int err;

@@ -210,10 +210,10 @@ static inline int copy_xregs_to_user(struct xsave_struct __user *buf)
/*
* Restore xstate from user space xsave area.
*/
-static inline int copy_user_to_xregs(struct xsave_struct __user *buf, u64 mask)
+static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
{
int err = 0;
- struct xsave_struct *xstate = ((__force struct xsave_struct *)buf);
+ struct xregs_state *xstate = ((__force struct xregs_state *)buf);
u32 lmask = mask;
u32 hmask = mask >> 32;

@@ -226,7 +226,7 @@ static inline int copy_user_to_xregs(struct xsave_struct __user *buf, u64 mask)
return err;
}

-void *get_xsave_addr(struct xsave_struct *xsave, int xstate);
+void *get_xsave_addr(struct xregs_state *xsave, int xstate);
void setup_xstate_comp(void);

#endif
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index a952a13d59a7..f3c1b71d4fae 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -60,8 +60,8 @@

#ifdef CONFIG_X86_INTEL_MPX
siginfo_t *mpx_generate_siginfo(struct pt_regs *regs,
- struct xsave_struct *xsave_buf);
-int mpx_handle_bd_fault(struct xsave_struct *xsave_buf);
+ struct xregs_state *xsave_buf);
+int mpx_handle_bd_fault(struct xregs_state *xsave_buf);
static inline int kernel_managing_mpx_tables(struct mm_struct *mm)
{
return (mm->bd_addr != MPX_INVALID_BOUNDS_DIR);
@@ -78,11 +78,11 @@ void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long start, unsigned long end);
#else
static inline siginfo_t *mpx_generate_siginfo(struct pt_regs *regs,
- struct xsave_struct *xsave_buf)
+ struct xregs_state *xsave_buf)
{
return NULL;
}
-static inline int mpx_handle_bd_fault(struct xsave_struct *xsave_buf)
+static inline int mpx_handle_bd_fault(struct xregs_state *xsave_buf)
{
return -EINVAL;
}
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index f3443b9fb7d8..0acdfc5f8d19 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -16,7 +16,7 @@
* Represents the initial FPU state. It's mostly (but not completely) zeroes,
* depending on the FPU hardware format:
*/
-union thread_xstate init_fpstate __read_mostly;
+union fpregs_state init_fpstate __read_mostly;

/*
* Track whether the kernel is using the FPU state
@@ -200,7 +200,7 @@ EXPORT_SYMBOL_GPL(fpu__save);
/*
* Legacy x87 fpstate state init:
*/
-static inline void fpstate_init_fstate(struct i387_fsave_struct *fp)
+static inline void fpstate_init_fstate(struct fregs_state *fp)
{
fp->cwd = 0xffff037fu;
fp->swd = 0xffff0000u;
@@ -208,7 +208,7 @@ static inline void fpstate_init_fstate(struct i387_fsave_struct *fp)
fp->fos = 0xffff0000u;
}

-void fpstate_init(union thread_xstate *state)
+void fpstate_init(union fpregs_state *state)
{
if (!cpu_has_fpu) {
fpstate_init_soft(&state->soft);
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 93bc11a5812c..504370662899 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -95,7 +95,7 @@ static void fpu__init_system_mxcsr(void)
unsigned int mask = 0;

if (cpu_has_fxsr) {
- struct i387_fxsave_struct fx_tmp __aligned(32) = { };
+ struct fxregs_state fx_tmp __aligned(32) = { };

asm volatile("fxsave %0" : "+m" (fx_tmp));

@@ -155,12 +155,12 @@ static void fpu__init_system_xstate_size_legacy(void)
*/
setup_clear_cpu_cap(X86_FEATURE_XSAVE);
setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT);
- xstate_size = sizeof(struct i387_soft_struct);
+ xstate_size = sizeof(struct swregs_state);
} else {
if (cpu_has_fxsr)
- xstate_size = sizeof(struct i387_fxsave_struct);
+ xstate_size = sizeof(struct fxregs_state);
else
- xstate_size = sizeof(struct i387_fsave_struct);
+ xstate_size = sizeof(struct fregs_state);
}
}

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 1f58a1c2a941..297b3da8e4c4 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -76,7 +76,7 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
void *kbuf, void __user *ubuf)
{
struct fpu *fpu = &target->thread.fpu;
- struct xsave_struct *xsave;
+ struct xregs_state *xsave;
int ret;

if (!cpu_has_xsave)
@@ -105,7 +105,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
const void *kbuf, const void __user *ubuf)
{
struct fpu *fpu = &target->thread.fpu;
- struct xsave_struct *xsave;
+ struct xregs_state *xsave;
int ret;

if (!cpu_has_xsave)
@@ -156,7 +156,7 @@ static inline unsigned short twd_i387_to_fxsr(unsigned short twd)
#define FP_EXP_TAG_SPECIAL 2
#define FP_EXP_TAG_EMPTY 3

-static inline u32 twd_fxsr_to_i387(struct i387_fxsave_struct *fxsave)
+static inline u32 twd_fxsr_to_i387(struct fxregs_state *fxsave)
{
struct _fpxreg *st;
u32 tos = (fxsave->swd >> 11) & 7;
@@ -204,7 +204,7 @@ static inline u32 twd_fxsr_to_i387(struct i387_fxsave_struct *fxsave)
void
convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
{
- struct i387_fxsave_struct *fxsave = &tsk->thread.fpu.state.fxsave;
+ struct fxregs_state *fxsave = &tsk->thread.fpu.state.fxsave;
struct _fpreg *to = (struct _fpreg *) &env->st_space[0];
struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0];
int i;
@@ -242,7 +242,7 @@ void convert_to_fxsr(struct task_struct *tsk,
const struct user_i387_ia32_struct *env)

{
- struct i387_fxsave_struct *fxsave = &tsk->thread.fpu.state.fxsave;
+ struct fxregs_state *fxsave = &tsk->thread.fpu.state.fxsave;
struct _fpreg *from = (struct _fpreg *) &env->st_space[0];
struct _fpxreg *to = (struct _fpxreg *) &fxsave->st_space[0];
int i;
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 8d0c26ab5123..99f73093333d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -17,11 +17,11 @@ static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
* Check for the presence of extended state information in the
* user fpstate pointer in the sigcontext.
*/
-static inline int check_for_xstate(struct i387_fxsave_struct __user *buf,
+static inline int check_for_xstate(struct fxregs_state __user *buf,
void __user *fpstate,
struct _fpx_sw_bytes *fx_sw)
{
- int min_xstate_size = sizeof(struct i387_fxsave_struct) +
+ int min_xstate_size = sizeof(struct fxregs_state) +
sizeof(struct xstate_header);
unsigned int magic2;

@@ -54,7 +54,7 @@ static inline int check_for_xstate(struct i387_fxsave_struct __user *buf,
static inline int save_fsave_header(struct task_struct *tsk, void __user *buf)
{
if (use_fxsr()) {
- struct xsave_struct *xsave = &tsk->thread.fpu.state.xsave;
+ struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
struct user_i387_ia32_struct env;
struct _fpstate_ia32 __user *fp = buf;

@@ -65,7 +65,7 @@ static inline int save_fsave_header(struct task_struct *tsk, void __user *buf)
__put_user(X86_FXSR_MAGIC, &fp->magic))
return -1;
} else {
- struct i387_fsave_struct __user *fp = buf;
+ struct fregs_state __user *fp = buf;
u32 swd;
if (__get_user(swd, &fp->swd) || __put_user(swd, &fp->status))
return -1;
@@ -76,7 +76,7 @@ static inline int save_fsave_header(struct task_struct *tsk, void __user *buf)

static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
{
- struct xsave_struct __user *x = buf;
+ struct xregs_state __user *x = buf;
struct _fpx_sw_bytes *sw_bytes;
u32 xfeatures;
int err;
@@ -114,16 +114,16 @@ static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
return err;
}

-static inline int copy_fpregs_to_sigframe(struct xsave_struct __user *buf)
+static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
{
int err;

if (use_xsave())
err = copy_xregs_to_user(buf);
else if (use_fxsr())
- err = copy_fxregs_to_user((struct i387_fxsave_struct __user *) buf);
+ err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
else
- err = copy_fregs_to_user((struct i387_fsave_struct __user *) buf);
+ err = copy_fregs_to_user((struct fregs_state __user *) buf);

if (unlikely(err) && __clear_user(buf, xstate_size))
err = -EFAULT;
@@ -152,7 +152,7 @@ static inline int copy_fpregs_to_sigframe(struct xsave_struct __user *buf)
*/
int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
{
- struct xsave_struct *xsave = &current->thread.fpu.state.xsave;
+ struct xregs_state *xsave = &current->thread.fpu.state.xsave;
struct task_struct *tsk = current;
int ia32_fxstate = (buf != buf_fx);

@@ -195,7 +195,7 @@ sanitize_restored_xstate(struct task_struct *tsk,
struct user_i387_ia32_struct *ia32_env,
u64 xfeatures, int fx_only)
{
- struct xsave_struct *xsave = &tsk->thread.fpu.state.xsave;
+ struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
struct xstate_header *header = &xsave->header;

if (use_xsave()) {
@@ -280,7 +280,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
* memory layout. Restore just the FP/SSE and init all
* the other extended state.
*/
- state_size = sizeof(struct i387_fxsave_struct);
+ state_size = sizeof(struct fxregs_state);
fx_only = 1;
} else {
state_size = fx_sw_user.xstate_size;
@@ -353,8 +353,8 @@ int fpu__restore_sig(void __user *buf, int ia32_frame)
int size = xstate_sigframe_size();

if (ia32_frame && use_fxsr()) {
- buf_fx = buf + sizeof(struct i387_fsave_struct);
- size += sizeof(struct i387_fsave_struct);
+ buf_fx = buf + sizeof(struct fregs_state);
+ size += sizeof(struct fregs_state);
}

return __fpu__restore_sig(buf, buf_fx, size);
@@ -368,8 +368,8 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame,

*buf_fx = sp = round_down(sp - frame_size, 64);
if (ia32_frame && use_fxsr()) {
- frame_size += sizeof(struct i387_fsave_struct);
- sp -= sizeof(struct i387_fsave_struct);
+ frame_size += sizeof(struct fregs_state);
+ sp -= sizeof(struct fregs_state);
}

*size = frame_size;
@@ -385,7 +385,7 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
*/
void fpu__init_prepare_fx_sw_frame(void)
{
- int fsave_header_size = sizeof(struct i387_fsave_struct);
+ int fsave_header_size = sizeof(struct fregs_state);
int size = xstate_size + FP_XSTATE_MAGIC2_SIZE;

if (config_enabled(CONFIG_X86_32))
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 3629e2ef3c94..733a8aec7bd7 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(cpu_has_xfeatures);
*/
void fpstate_sanitize_xstate(struct fpu *fpu)
{
- struct i387_fxsave_struct *fx = &fpu->state.fxsave;
+ struct fxregs_state *fx = &fpu->state.fxsave;
int feature_bit;
u64 xfeatures;

@@ -231,7 +231,7 @@ void setup_xstate_comp(void)
* or standard form.
*/
xstate_comp_offsets[0] = 0;
- xstate_comp_offsets[1] = offsetof(struct i387_fxsave_struct, xmm_space);
+ xstate_comp_offsets[1] = offsetof(struct fxregs_state, xmm_space);

if (!cpu_has_xsaves) {
for (i = 2; i < xfeatures_nr; i++) {
@@ -386,7 +386,7 @@ void fpu__resume_cpu(void)
* Output:
* address of the state in the xsave area.
*/
-void *get_xsave_addr(struct xsave_struct *xsave, int xstate)
+void *get_xsave_addr(struct xregs_state *xsave, int xstate)
{
int feature = fls64(xstate) - 1;
if (!test_bit(feature, (unsigned long *)&xfeatures_mask))
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index cab397d0085f..6f581c65c648 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -371,7 +371,7 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
{
struct task_struct *tsk = current;
- struct xsave_struct *xsave_buf;
+ struct xregs_state *xsave_buf;
enum ctx_state prev_state;
struct bndcsr *bndcsr;
siginfo_t *info;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3d811bb2728f..e14a7a65e975 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3195,7 +3195,7 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,

static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
{
- struct xsave_struct *xsave = &vcpu->arch.guest_fpu.state.xsave;
+ struct xregs_state *xsave = &vcpu->arch.guest_fpu.state.xsave;
u64 xstate_bv = xsave->header.xfeatures;
u64 valid;

@@ -3231,7 +3231,7 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)

static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
{
- struct xsave_struct *xsave = &vcpu->arch.guest_fpu.state.xsave;
+ struct xregs_state *xsave = &vcpu->arch.guest_fpu.state.xsave;
u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
u64 valid;

@@ -3277,7 +3277,7 @@ static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
} else {
memcpy(guest_xsave->region,
&vcpu->arch.guest_fpu.state.fxsave,
- sizeof(struct i387_fxsave_struct));
+ sizeof(struct fxregs_state));
*(u64 *)&guest_xsave->region[XSAVE_HDR_OFFSET / sizeof(u32)] =
XSTATE_FPSSE;
}
@@ -3302,7 +3302,7 @@ static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
if (xstate_bv & ~XSTATE_FPSSE)
return -EINVAL;
memcpy(&vcpu->arch.guest_fpu.state.fxsave,
- guest_xsave->region, sizeof(struct i387_fxsave_struct));
+ guest_xsave->region, sizeof(struct fxregs_state));
}
return 0;
}
@@ -6970,7 +6970,7 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,

int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
{
- struct i387_fxsave_struct *fxsave =
+ struct fxregs_state *fxsave =
&vcpu->arch.guest_fpu.state.fxsave;

memcpy(fpu->fpr, fxsave->st_space, 128);
@@ -6987,7 +6987,7 @@ int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)

int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
{
- struct i387_fxsave_struct *fxsave =
+ struct fxregs_state *fxsave =
&vcpu->arch.guest_fpu.state.fxsave;

memcpy(fxsave->st_space, fpu->fpr, 128);
diff --git a/arch/x86/math-emu/fpu_aux.c b/arch/x86/math-emu/fpu_aux.c
index 768b2b8271d6..dd76a05729b0 100644
--- a/arch/x86/math-emu/fpu_aux.c
+++ b/arch/x86/math-emu/fpu_aux.c
@@ -30,7 +30,7 @@ static void fclex(void)
}

/* Needs to be externally visible */
-void fpstate_init_soft(struct i387_soft_struct *soft)
+void fpstate_init_soft(struct swregs_state *soft)
{
struct address *oaddr, *iaddr;
memset(soft, 0, sizeof(*soft));
diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index 5b850514eb68..f37e84ab49f3 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -669,7 +669,7 @@ void math_abort(struct math_emu_info *info, unsigned int signal)
#endif /* PARANOID */
}

-#define S387 ((struct i387_soft_struct *)s387)
+#define S387 ((struct swregs_state *)s387)
#define sstatus_word() \
((S387->swd & ~SW_Top & 0xffff) | ((S387->ftop << SW_Top_Shift) & SW_Top))

@@ -678,14 +678,14 @@ int fpregs_soft_set(struct task_struct *target,
unsigned int pos, unsigned int count,
const void *kbuf, const void __user *ubuf)
{
- struct i387_soft_struct *s387 = &target->thread.fpu.state.soft;
+ struct swregs_state *s387 = &target->thread.fpu.state.soft;
void *space = s387->st_space;
int ret;
int offset, other, i, tags, regnr, tag, newtop;

RE_ENTRANT_CHECK_OFF;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, s387, 0,
- offsetof(struct i387_soft_struct, st_space));
+ offsetof(struct swregs_state, st_space));
RE_ENTRANT_CHECK_ON;

if (ret)
@@ -730,7 +730,7 @@ int fpregs_soft_get(struct task_struct *target,
unsigned int pos, unsigned int count,
void *kbuf, void __user *ubuf)
{
- struct i387_soft_struct *s387 = &target->thread.fpu.state.soft;
+ struct swregs_state *s387 = &target->thread.fpu.state.soft;
const void *space = s387->st_space;
int ret;
int offset = (S387->ftop & 7) * 10, other = 80 - offset;
@@ -748,7 +748,7 @@ int fpregs_soft_get(struct task_struct *target,
#endif /* PECULIAR_486 */

ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, s387, 0,
- offsetof(struct i387_soft_struct, st_space));
+ offsetof(struct swregs_state, st_space));

/* Copy all registers in stack order. */
if (!ret)
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 5e20bacee210..2e0dfd39bd22 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -272,7 +272,7 @@ static int mpx_insn_decode(struct insn *insn,
* The caller is expected to kfree() the returned siginfo_t.
*/
siginfo_t *mpx_generate_siginfo(struct pt_regs *regs,
- struct xsave_struct *xsave_buf)
+ struct xregs_state *xsave_buf)
{
struct bndreg *bndregs, *bndreg;
siginfo_t *info = NULL;
@@ -497,7 +497,7 @@ static int allocate_bt(long __user *bd_entry)
* bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
* and the size of each bound table is 4MB.
*/
-static int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+static int do_mpx_bt_fault(struct xregs_state *xsave_buf)
{
unsigned long bd_entry, bd_base;
struct bndcsr *bndcsr;
@@ -525,7 +525,7 @@ static int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
return allocate_bt((long __user *)bd_entry);
}

-int mpx_handle_bd_fault(struct xsave_struct *xsave_buf)
+int mpx_handle_bd_fault(struct xregs_state *xsave_buf)
{
/*
* Userspace never asked us to manage the bounds tables,
--
2.1.0

2015-05-05 18:03:47

by Ingo Molnar

Subject: [PATCH 197/208] x86/fpu: Change fpu->fpregs_active from 'int' to 'char', add lazy switching comments

Improve the memory layout of 'struct fpu':

- change ->fpregs_active from 'int' to 'char' - it's just a single flag
and modern x86 CPUs can do efficient byte accesses.

- pack related fields closer to each other: often 'fpu->state' will not be
touched, while the other fields will - so pack them into a group.

Also add comments to each field, describing their purpose, and add
some background information about lazy restores.

Also fix an obsolete lazy-switching related comment in fpu_copy()'s description.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/types.h | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++--------
arch/x86/kernel/fpu/core.c | 6 ++---
arch/x86/kernel/fpu/xstate.c | 9 ++++---
3 files changed, 79 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index fe2ce3276a38..261cfb76065f 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -159,8 +159,44 @@ union fpregs_state {

struct fpu {
/*
+ * @state:
+ *
+ * In-memory copy of all FPU registers that we save/restore
+ * over context switches. If the task is using the FPU then
+ * the registers in the FPU are more recent than this state
+ * copy. If the task context-switches away then they get
+ * saved here and represent the FPU state.
+ *
+ * After context switches there may be a (short) time period
+ * during which the in-FPU hardware registers are unchanged
+ * and still perfectly match this state, if the tasks
+ * scheduled afterwards are not using the FPU.
+ *
+ * This is the 'lazy restore' window of optimization, which
+ * we track through 'fpu_fpregs_owner_ctx' and 'fpu->last_cpu'.
+ *
+ * We detect whether a subsequent task uses the FPU via setting
+ * CR0::TS to 1, which causes any FPU use to raise a #NM fault.
+ *
+ * During this window, if the task gets scheduled again, we
+ * might be able to skip having to do a restore from this
+ * memory buffer to the hardware registers - at the cost of
+ * incurring the overhead of #NM fault traps.
+ *
+ * Note that on modern CPUs that support the XSAVEOPT (or other
+ * optimized XSAVE instructions), we don't use #NM traps anymore,
+ * as the hardware can track whether FPU registers need saving
+ * or not. On such CPUs we activate the non-lazy ('eagerfpu')
+ * logic, which unconditionally saves/restores all FPU state
+ * across context switches. (if FPU state exists.)
+ */
+ union fpregs_state state;
+
+ /*
+ * @last_cpu:
+ *
* Records the last CPU on which this context was loaded into
- * FPU registers. (In the lazy-switching case we might be
+ * FPU registers. (In the lazy-restore case we might be
* able to reuse FPU registers across multiple context switches
* this way, if no intermediate task used the FPU.)
*
@@ -170,23 +206,49 @@ struct fpu {
*/
unsigned int last_cpu;

- unsigned int fpregs_active;
- union fpregs_state state;
/*
+ * @fpstate_active:
+ *
+ * This flag indicates whether this context is active: if the task
+ * is not running then we can restore from this context, if the task
+ * is running then we should save into this context.
+ */
+ unsigned char fpstate_active;
+
+ /*
+ * @fpregs_active:
+ *
+ * This flag determines whether a given context is actively
+ * loaded into the FPU's registers and that those registers
+ * represent the task's current FPU state.
+ *
+ * Note the interaction with fpstate_active:
+ *
+ * # task does not use the FPU:
+ * fpstate_active == 0
+ *
+ * # task uses the FPU and regs are active:
+ * fpstate_active == 1 && fpregs_active == 1
+ *
+ * # the regs are inactive but still match fpstate:
+ * fpstate_active == 1 && fpregs_active == 0 && fpregs_owner == fpu
+ *
+ * The third state is what we use for the lazy restore optimization
+ * on lazy-switching CPUs.
+ */
+ unsigned char fpregs_active;
+
+ /*
+ * @counter:
+ *
* This counter contains the number of consecutive context switches
* during which the FPU stays used. If this is over a threshold, the
- * lazy fpu saving logic becomes unlazy, to save the trap overhead.
+ * lazy FPU restore logic becomes eager, to save the trap overhead.
* This is an unsigned char so that after 256 iterations the counter
* wraps and the context switch behavior turns lazy again; this is to
* deal with bursty apps that only use the FPU for a short time:
*/
unsigned char counter;
- /*
- * This flag indicates whether this context is fpstate_active: if the task is
- * not running then we can restore from this context, if the task
- * is running then we should save into this context.
- */
- unsigned char fpstate_active;
};

#endif /* _ASM_X86_FPU_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 0acdfc5f8d19..63496c49a590 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -227,10 +227,8 @@ EXPORT_SYMBOL_GPL(fpstate_init);
/*
* Copy the current task's FPU state to a new task's FPU context.
*
- * In the 'eager' case we just save to the destination context.
- *
- * In the 'lazy' case we save to the source context, mark the FPU lazy
- * via stts() and copy the source context into the destination context.
+ * In both the 'eager' and the 'lazy' case we save hardware registers
+ * directly to the destination buffer.
*/
static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
{
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 733a8aec7bd7..cd7f1a6bd933 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -76,10 +76,11 @@ int cpu_has_xfeatures(u64 xfeatures_needed, const char **feature_name)
EXPORT_SYMBOL_GPL(cpu_has_xfeatures);

/*
- * When executing XSAVEOPT (optimized XSAVE), if a processor implementation
- * detects that an FPU state component is still (or is again) in its
- * initialized state, it may clear the corresponding bit in the header.xfeatures
- * field, and can skip the writeout of registers to the corresponding memory layout.
+ * When executing XSAVEOPT (or other optimized XSAVE instructions), if
+ * a processor implementation detects that an FPU state component is still
+ * (or is again) in its initialized state, it may clear the corresponding
+ * bit in the header.xfeatures field, and can skip the writeout of registers
+ * to the corresponding memory layout.
*
* This means that when the bit is zero, the state component might still contain
* some previous - non-initialized register state.
--
2.1.0

2015-05-05 18:03:03

by Ingo Molnar

Subject: [PATCH 198/208] x86/fpu: Document the various fpregs state formats

Document all the structures that make up 'struct fpu'.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/types.h | 35 +++++++++++++++++++++++++++++++++--
1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 261cfb76065f..4c4eceb08a42 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -4,6 +4,10 @@
#ifndef _ASM_X86_FPU_H
#define _ASM_X86_FPU_H

+/*
+ * The legacy x87 FPU state format, as saved by FSAVE and
+ * restored by the FRSTOR instructions:
+ */
struct fregs_state {
u32 cwd; /* FPU Control Word */
u32 swd; /* FPU Status Word */
@@ -16,10 +20,16 @@ struct fregs_state {
/* 8*10 bytes for each FP-reg = 80 bytes: */
u32 st_space[20];

- /* Software status information [not touched by FSAVE ]: */
+ /* Software status information [not touched by FSAVE]: */
u32 status;
};

+/*
+ * The legacy fx SSE/MMX FPU state format, as saved by FXSAVE and
+ * restored by the FXRSTOR instructions. It's similar to the FSAVE
+ * format, but differs in some areas, plus has extensions at
+ * the end for the XMM registers.
+ */
struct fxregs_state {
u16 cwd; /* Control Word */
u16 swd; /* Status Word */
@@ -56,7 +66,8 @@ struct fxregs_state {
} __attribute__((aligned(16)));

/*
- * Software based FPU emulation state:
+ * Software based FPU emulation state. This is arbitrary really,
+ * it matches the x87 format to make it easier to understand:
*/
struct swregs_state {
u32 cwd;
@@ -140,6 +151,14 @@ struct xstate_header {
u64 reserved[6];
} __attribute__((packed));

+/*
+ * This is our most modern FPU state format, as saved by the XSAVE
+ * and restored by the XRSTOR instructions.
+ *
+ * It consists of a legacy fxregs portion, an xstate header and
+ * subsequent fixed size areas as defined by the xstate header.
+ * Not all CPUs support all the extensions.
+ */
struct xregs_state {
struct fxregs_state i387;
struct xstate_header header;
@@ -150,6 +169,13 @@ struct xregs_state {
/* New processor state extensions will go here. */
} __attribute__ ((packed, aligned (64)));

+/*
+ * This is a union of all the possible FPU state formats
+ * put together, so that we can pick the right one runtime.
+ *
+ * The size of the structure is determined by the largest
+ * member - which is the xsave area:
+ */
union fpregs_state {
struct fregs_state fsave;
struct fxregs_state fxsave;
@@ -157,6 +183,11 @@ union fpregs_state {
struct xregs_state xsave;
};

+/*
+ * Highest level per task FPU state data structure that
+ * contains the FPU register state plus various FPU
+ * state fields:
+ */
struct fpu {
/*
* @state:
--
2.1.0

2015-05-05 17:59:59

by Ingo Molnar

Subject: [PATCH 199/208] x86/fpu: Move debugging check from kernel_fpu_begin() to __kernel_fpu_begin()

kernel_fpu_begin() is __kernel_fpu_begin() with a preempt_disable().

Move the kernel_fpu_begin() debugging check into __kernel_fpu_begin(),
so that users of __kernel_fpu_begin() may benefit from it as well.
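The effect of the move can be sketched with a toy wrapper pair (hypothetical names, not the kernel functions): once the check lives in the inner function, both the preemption-managing wrapper and direct callers of the inner function hit it.

```c
#include <assert.h>

/* Toy model of the kernel_fpu_begin()/__kernel_fpu_begin() split;
 * plain counters stand in for preempt_disable() and WARN_ON_ONCE(): */
static int debug_checks_run;
static int preempt_disables;

static void debug_check_usable(void)
{
	debug_checks_run++;	/* models WARN_ON_ONCE(!irq_fpu_usable()) */
}

/* Inner function: the debugging check now lives here... */
static void inner_begin(void)
{
	debug_check_usable();
	/* ... disable in-kernel FPU use, save user registers, etc. ... */
}

/* ...so the outer wrapper no longer needs its own copy: */
static void outer_begin(void)
{
	preempt_disables++;	/* models preempt_disable() */
	inner_begin();
}
```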

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/fpu/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 63496c49a590..843901f10754 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -109,6 +109,8 @@ void __kernel_fpu_begin(void)
{
struct fpu *fpu = &current->thread.fpu;

+ WARN_ON_ONCE(!irq_fpu_usable());
+
kernel_fpu_disable();

if (fpu->fpregs_active) {
@@ -138,7 +140,6 @@ EXPORT_SYMBOL(__kernel_fpu_end);
void kernel_fpu_begin(void)
{
preempt_disable();
- WARN_ON_ONCE(!irq_fpu_usable());
__kernel_fpu_begin();
}
EXPORT_SYMBOL_GPL(kernel_fpu_begin);
--
2.1.0

2015-05-05 18:02:36

by Ingo Molnar

Subject: [PATCH 200/208] x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end

The current xstate code in setup_xstate_features() assumes that
the first zero bit means the end of xfeatures - but that is not
so, the SDM clearly states that an arbitrary set of xfeatures
might be enabled - and it is also clear from the description
of the compaction feature that holes are possible:

"13-6 Vol. 1MANAGING STATE USING THE XSAVE FEATURE SET
[...]

Compacted format. Each state component i (i ≥ 2) is located at a byte
offset from the base address of the XSAVE area based on the XCOMP_BV
field in the XSAVE header:

— If XCOMP_BV[i] = 0, state component i is not in the XSAVE area.

— If XCOMP_BV[i] = 1, the following items apply:

• If XCOMP_BV[j] = 0 for every j, 2 ≤ j < i, state component i is
located at a byte offset 576 from the base address of the XSAVE
area. (This item applies if i is the first bit set in bits 62:2 of
the XCOMP_BV; it implies that state component i is located at the
beginning of the extended region.)

• Otherwise, let j, 2 ≤ j < i, be the greatest value such that
XCOMP_BV[j] = 1. Then state component i is located at a byte offset
X from the location of state component j, where X is the number of
bytes required for state component j as enumerated in
CPUID.(EAX=0DH,ECX=j):EAX. (This item implies that state component i
immediately follows the preceding state component whose bit is set
in XCOMP_BV.)"

So don't assume that the first zero xfeatures bit means the end of
all xfeatures - iterate through all of them.

I'm not aware of hardware that triggers this currently.
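The difference between the two loop shapes can be illustrated with a standalone sketch (not the kernel code - the real loop queries CPUID per leaf, modeled here by testing mask bits): with a sparse mask such as 0x85 (bits 0, 2 and 7 set), stopping at the first hole misses bit 7, while iterating up to the highest set bit visits every candidate leaf.

```c
#include <assert.h>
#include <stdint.h>

/* Simple stand-in for the kernel's fls64(): 1-based position of
 * the highest set bit, 0 if no bit is set. */
static int fls64_sketch(uint64_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Broken approach: stop at the first hole above the legacy bits: */
static int count_until_first_hole(uint64_t mask)
{
	int leaf, count = 0;

	for (leaf = 2; leaf < 64; leaf++) {
		if (!(mask & (1ULL << leaf)))
			break;
		count++;
	}
	return count;
}

/* Fixed approach: walk every leaf up to fls64(mask): */
static int count_all_extended(uint64_t mask)
{
	int nr = fls64_sketch(mask);
	int leaf, count = 0;

	for (leaf = 2; leaf < nr; leaf++)
		if (mask & (1ULL << leaf))
			count++;
	return count;
}
```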

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/fpu/xstate.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index cd7f1a6bd933..a024fa591a93 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -168,26 +168,27 @@ void fpu__init_cpu_xstate(void)
}

/*
- * Record the offsets and sizes of different state managed by the xsave
- * memory layout.
+ * Record the offsets and sizes of various xstates contained
+ * in the XSAVE state memory layout.
+ *
+ * ( Note that certain features might be non-present, for them
+ * we'll have 0 offset and 0 size. )
*/
static void __init setup_xstate_features(void)
{
- int eax, ebx, ecx, edx, leaf = 0x2;
+ u32 eax, ebx, ecx, edx, leaf;

xfeatures_nr = fls64(xfeatures_mask);

- do {
+ for (leaf = 2; leaf < xfeatures_nr; leaf++) {
cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);

- if (eax == 0)
- break;
-
xstate_offsets[leaf] = ebx;
xstate_sizes[leaf] = eax;

+ printk(KERN_INFO "x86/fpu: xstate_offset[%d]: %04x, xstate_sizes[%d]: %04x\n", leaf, ebx, leaf, eax);
leaf++;
- } while (1);
+ }
}

static void print_xstate_feature(u64 xstate_mask)
--
2.1.0

2015-05-05 18:00:08

by Ingo Molnar

Subject: [PATCH 201/208] x86/fpu: Clean up xstate feature reservation

Put MPX support into its separate high level structure, and
also replace the fixed YMM, LWP and MPX structures in
xregs_state with just reservations - their exact offsets
in the structure will depend on the CPU and no code actually
relies on those fields.

No change in functionality.
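The reservation pattern can be sketched in isolation (all sizes here are invented for the illustration; only the pattern matters): instead of naming each extended component as a field - which would hard-code its offset - one opaque array is reserved, sized as the sum of the component sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the extended-state component types: */
struct ymmh_sketch { uint8_t data[256]; };
struct lwp_sketch  { uint8_t data[128]; };
struct mpx_sketch  { uint8_t bndreg[4 * 16]; uint8_t bndcsr[16]; };

/* New components extend the sum; no field offsets are implied: */
#define XSTATE_RESERVE_SKETCH (sizeof(struct ymmh_sketch) + \
			       sizeof(struct lwp_sketch) + \
			       sizeof(struct mpx_sketch))

struct xregs_sketch {
	uint8_t legacy[512];			/* fxregs area   */
	uint8_t header[64];			/* xstate header */
	uint8_t __reserved[XSTATE_RESERVE_SKETCH];
};
```

Since no code addresses the components through struct fields anymore, their actual offsets can vary per CPU (as they do in the compacted XSAVE format) without lying to the reader about the layout.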

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/types.h | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 4c4eceb08a42..02241c2a10e9 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -145,12 +145,21 @@ struct bndcsr {
u64 bndstatus;
} __packed;

+struct mpx_struct {
+ struct bndreg bndreg[4];
+ struct bndcsr bndcsr;
+};
+
struct xstate_header {
u64 xfeatures;
u64 xcomp_bv;
u64 reserved[6];
} __attribute__((packed));

+/* New processor state extensions should be added here: */
+#define XSTATE_RESERVE (sizeof(struct ymmh_struct) + \
+ sizeof(struct lwp_struct) + \
+ sizeof(struct mpx_struct) )
/*
* This is our most modern FPU state format, as saved by the XSAVE
* and restored by the XRSTOR instructions.
@@ -162,11 +171,7 @@ struct xstate_header {
struct xregs_state {
struct fxregs_state i387;
struct xstate_header header;
- struct ymmh_struct ymmh;
- struct lwp_struct lwp;
- struct bndreg bndreg[4];
- struct bndcsr bndcsr;
- /* New processor state extensions will go here. */
+ u8 __reserved[XSTATE_RESERVE];
} __attribute__ ((packed, aligned (64)));

/*
--
2.1.0

2015-05-05 18:00:12

by Ingo Molnar

Subject: [PATCH 202/208] x86/fpu/xstate: Clean up setup_xstate_comp() call

Call setup_xstate_comp() from the xstate init code, not
from the generic fpu__init_system() code.

This allows us to remove the prototype from xstate.h as well.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/xstate.h | 1 -
arch/x86/kernel/fpu/init.c | 1 -
arch/x86/kernel/fpu/xstate.c | 6 ++----
3 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index a6181b9ebf42..8f336d2ae126 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -227,6 +227,5 @@ static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
}

void *get_xsave_addr(struct xregs_state *xsave, int xstate);
-void setup_xstate_comp(void);

#endif
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 504370662899..889025217407 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -262,7 +262,6 @@ void fpu__init_system(struct cpuinfo_x86 *c)
fpu__init_system_generic();
fpu__init_system_xstate_size_legacy();
fpu__init_system_xstate();
- setup_xstate_comp();

fpu__init_system_ctx_switch();
}
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index a024fa591a93..9e77332f00e4 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -218,11 +218,8 @@ static void print_xstate_features(void)
* This function sets up offsets and sizes of all extended states in
* xsave area. This supports both standard format and compacted format
* of the xsave aread.
- *
- * Input: void
- * Output: void
*/
-void setup_xstate_comp(void)
+static void setup_xstate_comp(void)
{
unsigned int xstate_comp_sizes[sizeof(xfeatures_mask)*8];
int i;
@@ -355,6 +352,7 @@ void fpu__init_system_xstate(void)
update_regset_xstate_info(xstate_size, xfeatures_mask);
fpu__init_prepare_fx_sw_frame();
setup_init_fpu_buf();
+ setup_xstate_comp();

pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is 0x%x bytes, using '%s' format.\n",
xfeatures_mask,
--
2.1.0

2015-05-05 18:00:17

by Ingo Molnar

Subject: [PATCH 203/208] x86/fpu/init: Propagate __init annotations

Now that all the FPU init function call dependencies are
cleaned up we can propagate __init annotations deeper.

This shrinks the runtime size of the kernel a bit, and
also addresses a few section warnings.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/fpu/bugs.c | 2 +-
arch/x86/kernel/fpu/init.c | 12 ++++++------
arch/x86/kernel/fpu/xstate.c | 10 +++++-----
3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/fpu/bugs.c b/arch/x86/kernel/fpu/bugs.c
index 449b5f3f4925..dd9ca9b60ff3 100644
--- a/arch/x86/kernel/fpu/bugs.c
+++ b/arch/x86/kernel/fpu/bugs.c
@@ -60,7 +60,7 @@ static void __init check_fpu(void)
}
}

-void fpu__init_check_bugs(void)
+void __init fpu__init_check_bugs(void)
{
/*
* kernel_fpu_begin/end() in check_fpu() relies on the patched
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 889025217407..a9e506a99a83 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -90,7 +90,7 @@ static void fpu__init_system_early_generic(struct cpuinfo_x86 *c)
*/
unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;

-static void fpu__init_system_mxcsr(void)
+static void __init fpu__init_system_mxcsr(void)
{
unsigned int mask = 0;

@@ -115,7 +115,7 @@ static void fpu__init_system_mxcsr(void)
/*
* Once per bootup FPU initialization sequences that will run on most x86 CPUs:
*/
-static void fpu__init_system_generic(void)
+static void __init fpu__init_system_generic(void)
{
/*
* Set up the legacy init FPU context. (xstate init might overwrite this
@@ -141,7 +141,7 @@ EXPORT_SYMBOL_GPL(xstate_size);
* We set this up first, and later it will be overwritten by
* fpu__init_system_xstate() if the CPU knows about xstates.
*/
-static void fpu__init_system_xstate_size_legacy(void)
+static void __init fpu__init_system_xstate_size_legacy(void)
{
/*
* Note that xstate_size might be overwriten later during
@@ -212,7 +212,7 @@ __setup("eagerfpu=", eager_fpu_setup);
/*
* Pick the FPU context switching strategy:
*/
-static void fpu__init_system_ctx_switch(void)
+static void __init fpu__init_system_ctx_switch(void)
{
WARN_ON(current->thread.fpu.fpstate_active);
current_thread_info()->status = 0;
@@ -234,14 +234,14 @@ static void fpu__init_system_ctx_switch(void)
if (eagerfpu == ENABLE)
setup_force_cpu_cap(X86_FEATURE_EAGER_FPU);

- printk_once(KERN_INFO "x86/fpu: Using '%s' FPU context switches.\n", eagerfpu == ENABLE ? "eager" : "lazy");
+ printk(KERN_INFO "x86/fpu: Using '%s' FPU context switches.\n", eagerfpu == ENABLE ? "eager" : "lazy");
}

/*
* Called on the boot CPU once per system bootup, to set up the initial
* FPU state that is later cloned into all processes:
*/
-void fpu__init_system(struct cpuinfo_x86 *c)
+void __init fpu__init_system(struct cpuinfo_x86 *c)
{
fpu__init_system_early_generic(c);

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 9e77332f00e4..201f08feb259 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -191,7 +191,7 @@ static void __init setup_xstate_features(void)
}
}

-static void print_xstate_feature(u64 xstate_mask)
+static void __init print_xstate_feature(u64 xstate_mask)
{
const char *feature_name;

@@ -202,7 +202,7 @@ static void print_xstate_feature(u64 xstate_mask)
/*
* Print out all the supported xstate features:
*/
-static void print_xstate_features(void)
+static void __init print_xstate_features(void)
{
print_xstate_feature(XSTATE_FP);
print_xstate_feature(XSTATE_SSE);
@@ -219,7 +219,7 @@ static void print_xstate_features(void)
* xsave area. This supports both standard format and compacted format
* of the xsave aread.
*/
-static void setup_xstate_comp(void)
+static void __init setup_xstate_comp(void)
{
unsigned int xstate_comp_sizes[sizeof(xfeatures_mask)*8];
int i;
@@ -260,7 +260,7 @@ static void setup_xstate_comp(void)
/*
* setup the xstate image representing the init state
*/
-static void setup_init_fpu_buf(void)
+static void __init setup_init_fpu_buf(void)
{
if (!cpu_has_xsave)
return;
@@ -314,7 +314,7 @@ static void __init init_xstate_size(void)
*
* ( Not marked __init because of false positive section warnings. )
*/
-void fpu__init_system_xstate(void)
+void __init fpu__init_system_xstate(void)
{
unsigned int eax, ebx, ecx, edx;

--
2.1.0

2015-05-05 18:01:57

by Ingo Molnar

Subject: [PATCH 204/208] x86/fpu: Pass 'struct fpu' to fpu__restore()

This cleans up the call sites and the function a bit,
and also makes it more symmetric with the other high
level FPU state handling functions.

It's still only valid for the current task, as we copy
to the FPU registers of the current CPU.

No change in functionality.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 2 +-
arch/x86/kernel/fpu/core.c | 9 +++------
arch/x86/kernel/fpu/signal.c | 2 +-
arch/x86/kernel/traps.c | 2 +-
drivers/lguest/x86/core.c | 2 +-
5 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 7fdc90b9dd86..a4c1b7dbf70e 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -50,7 +50,7 @@ extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
extern void fpu__activate_curr(struct fpu *fpu);
extern void fpu__activate_stopped(struct fpu *fpu);
extern void fpu__save(struct fpu *fpu);
-extern void fpu__restore(void);
+extern void fpu__restore(struct fpu *fpu);
extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 843901f10754..421a98103820 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -343,11 +343,8 @@ void fpu__activate_stopped(struct fpu *child_fpu)
* with local interrupts disabled, as it is in the case of
* do_device_not_available()).
*/
-void fpu__restore(void)
+void fpu__restore(struct fpu *fpu)
{
- struct task_struct *tsk = current;
- struct fpu *fpu = &tsk->thread.fpu;
-
fpu__activate_curr(fpu);

/* Avoid __kernel_fpu_begin() right after fpregs_activate() */
@@ -355,9 +352,9 @@ void fpu__restore(void)
fpregs_activate(fpu);
if (unlikely(copy_fpstate_to_fpregs(fpu))) {
fpu__clear(fpu);
- force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk);
+ force_sig_info(SIGSEGV, SEND_SIG_PRIV, current);
} else {
- tsk->thread.fpu.counter++;
+ fpu->counter++;
}
kernel_fpu_enable();
}
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 99f73093333d..50ec9af1bd51 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -319,7 +319,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpu->fpstate_active = 1;
if (use_eager_fpu()) {
preempt_disable();
- fpu__restore();
+ fpu__restore(fpu);
preempt_enable();
}

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 6f581c65c648..a2510f230195 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -803,7 +803,7 @@ do_device_not_available(struct pt_regs *regs, long error_code)
return;
}
#endif
- fpu__restore(); /* interrupts still off */
+ fpu__restore(&current->thread.fpu); /* interrupts still off */
#ifdef CONFIG_X86_32
conditional_sti(regs);
#endif
diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index 99bb3009e2d5..6a4cd771a2be 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -302,7 +302,7 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
* before this.
*/
else if (cpu->regs->trapnum == 7 && !fpregs_active())
- fpu__restore();
+ fpu__restore(&current->thread.fpu);
}

/*H:130
--
2.1.0

2015-05-05 18:01:55

by Ingo Molnar

Subject: [PATCH 205/208] x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT

I tried to simulate an ancient CPU via this option, and
found that it still has fxsr_opt enabled, confusing the
FPU code.

Make the 'nofxsr' option also clear the FXSR_OPT flag.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/cpu/common.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index d15610b0a4cf..d6fe512441e5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -171,6 +171,15 @@ static int __init x86_xsaves_setup(char *s)
}
__setup("noxsaves", x86_xsaves_setup);

+static int __init x86_fxsr_setup(char *s)
+{
+ setup_clear_cpu_cap(X86_FEATURE_FXSR);
+ setup_clear_cpu_cap(X86_FEATURE_FXSR_OPT);
+ setup_clear_cpu_cap(X86_FEATURE_XMM);
+ return 1;
+}
+__setup("nofxsr", x86_fxsr_setup);
+
#ifdef CONFIG_X86_32
static int cachesize_override = -1;
static int disable_x86_serial_nr = 1;
@@ -182,14 +191,6 @@ static int __init cachesize_setup(char *str)
}
__setup("cachesize=", cachesize_setup);

-static int __init x86_fxsr_setup(char *s)
-{
- setup_clear_cpu_cap(X86_FEATURE_FXSR);
- setup_clear_cpu_cap(X86_FEATURE_XMM);
- return 1;
-}
-__setup("nofxsr", x86_fxsr_setup);
-
static int __init x86_sep_setup(char *s)
{
setup_clear_cpu_cap(X86_FEATURE_SEP);
--
2.1.0

2015-05-05 18:01:35

by Ingo Molnar

Subject: [PATCH 206/208] x86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging code

There are various internal FPU state debugging checks that never
trigger in practice, but which are useful for FPU code development.

Separate these out into CONFIG_X86_DEBUG_FPU=y, and also add a
couple of new ones.

The size difference is about 0.5K of code on defconfig:

text data bss filename
15028906 2578816 1638400 vmlinux
15029430 2578816 1638400 vmlinux

( Keep this enabled by default until the new FPU code is debugged. )
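The underlying pattern is a config-gated sanity check. Here is a minimal sketch (hypothetical names; the config option is modeled by a plain #define, and a counter stands in for the kernel's warning machinery):

```c
#include <assert.h>

#define DEBUG_FPU_SKETCH 1	/* models CONFIG_X86_DEBUG_FPU=y */

static int warnings;

#if DEBUG_FPU_SKETCH
/* Check the condition, record a warning, and report it to the caller: */
# define WARN_ON_SKETCH(x)	((x) ? (warnings++, 1) : 0)
#else
/* Compiles to a constant: zero runtime overhead when disabled: */
# define WARN_ON_SKETCH(x)	(0)
#endif

static int checked_op(int fail)
{
	/* Usable both as a statement and as a condition: */
	if (WARN_ON_SKETCH(fail))
		return -1;
	return 0;
}
```

Note that, like the patch's disabled variant `({ 0; })`, the #else branch above discards the condition entirely, so a side-effecting condition would not be evaluated when debugging is off - callers must not rely on the argument running.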

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/Kconfig.debug | 12 ++++++++++++
arch/x86/include/asm/fpu/internal.h | 17 ++++++++++++++++-
arch/x86/kernel/fpu/core.c | 18 +++++++++---------
arch/x86/kernel/fpu/init.c | 12 +++++++++++-
arch/x86/kernel/fpu/xstate.c | 11 ++++++++++-
5 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 72484a645f05..2fd3ebbb4e33 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -332,4 +332,16 @@ config X86_DEBUG_STATIC_CPU_HAS

If unsure, say N.

+config X86_DEBUG_FPU
+ bool "Debug the x86 FPU code"
+ depends on DEBUG_KERNEL
+ default y
+ ---help---
+ If this option is enabled then there will be extra sanity
+ checks and (boot time) debug printouts added to the kernel.
+ This debugging adds some small amount of runtime overhead
+ to the kernel.
+
+ If unsure, say N.
+
endmenu
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a4c1b7dbf70e..d2a281bd5f45 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -59,6 +59,15 @@ extern void fpu__clear(struct fpu *fpu);
extern void fpu__init_check_bugs(void);
extern void fpu__resume_cpu(void);

+/*
+ * Debugging facility:
+ */
+#ifdef CONFIG_X86_DEBUG_FPU
+# define WARN_ON_FPU(x) WARN_ON_ONCE(x)
+#else
+# define WARN_ON_FPU(x) ({ 0; })
+#endif
+
DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);

/*
@@ -296,6 +305,8 @@ static inline void __fpregs_deactivate_hw(void)
/* Must be paired with an 'stts' (fpregs_deactivate_hw()) after! */
static inline void __fpregs_deactivate(struct fpu *fpu)
{
+ WARN_ON_FPU(!fpu->fpregs_active);
+
fpu->fpregs_active = 0;
this_cpu_write(fpu_fpregs_owner_ctx, NULL);
}
@@ -303,6 +314,8 @@ static inline void __fpregs_deactivate(struct fpu *fpu)
/* Must be paired with a 'clts' (fpregs_activate_hw()) before! */
static inline void __fpregs_activate(struct fpu *fpu)
{
+ WARN_ON_FPU(fpu->fpregs_active);
+
fpu->fpregs_active = 1;
this_cpu_write(fpu_fpregs_owner_ctx, fpu);
}
@@ -433,8 +446,10 @@ switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switch)
{
if (fpu_switch.preload) {
- if (unlikely(copy_fpstate_to_fpregs(new_fpu)))
+ if (unlikely(copy_fpstate_to_fpregs(new_fpu))) {
+ WARN_ON_FPU(1);
fpu__clear(new_fpu);
+ }
}
}

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 421a98103820..9df2a09f1bbe 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -38,13 +38,13 @@ DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);

static void kernel_fpu_disable(void)
{
- WARN_ON(this_cpu_read(in_kernel_fpu));
+ WARN_ON_FPU(this_cpu_read(in_kernel_fpu));
this_cpu_write(in_kernel_fpu, true);
}

static void kernel_fpu_enable(void)
{
- WARN_ON_ONCE(!this_cpu_read(in_kernel_fpu));
+ WARN_ON_FPU(!this_cpu_read(in_kernel_fpu));
this_cpu_write(in_kernel_fpu, false);
}

@@ -109,7 +109,7 @@ void __kernel_fpu_begin(void)
{
struct fpu *fpu = &current->thread.fpu;

- WARN_ON_ONCE(!irq_fpu_usable());
+ WARN_ON_FPU(!irq_fpu_usable());

kernel_fpu_disable();

@@ -127,7 +127,7 @@ void __kernel_fpu_end(void)
struct fpu *fpu = &current->thread.fpu;

if (fpu->fpregs_active) {
- if (WARN_ON(copy_fpstate_to_fpregs(fpu)))
+ if (WARN_ON_FPU(copy_fpstate_to_fpregs(fpu)))
fpu__clear(fpu);
} else {
__fpregs_deactivate_hw();
@@ -187,7 +187,7 @@ EXPORT_SYMBOL_GPL(irq_ts_restore);
*/
void fpu__save(struct fpu *fpu)
{
- WARN_ON(fpu != &current->thread.fpu);
+ WARN_ON_FPU(fpu != &current->thread.fpu);

preempt_disable();
if (fpu->fpregs_active) {
@@ -233,7 +233,7 @@ EXPORT_SYMBOL_GPL(fpstate_init);
*/
static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
{
- WARN_ON(src_fpu != &current->thread.fpu);
+ WARN_ON_FPU(src_fpu != &current->thread.fpu);

/*
* Don't let 'init optimized' areas of the XSAVE area
@@ -284,7 +284,7 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
*/
void fpu__activate_curr(struct fpu *fpu)
{
- WARN_ON_ONCE(fpu != &current->thread.fpu);
+ WARN_ON_FPU(fpu != &current->thread.fpu);

if (!fpu->fpstate_active) {
fpstate_init(&fpu->state);
@@ -321,7 +321,7 @@ EXPORT_SYMBOL_GPL(fpu__activate_curr);
*/
void fpu__activate_stopped(struct fpu *child_fpu)
{
- WARN_ON_ONCE(child_fpu == &current->thread.fpu);
+ WARN_ON_FPU(child_fpu == &current->thread.fpu);

if (child_fpu->fpstate_active) {
child_fpu->last_cpu = -1;
@@ -407,7 +407,7 @@ static inline void copy_init_fpstate_to_fpregs(void)
*/
void fpu__clear(struct fpu *fpu)
{
- WARN_ON_ONCE(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
+ WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */

if (!use_eager_fpu()) {
/* FPU state will be reallocated lazily at the first use. */
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index a9e506a99a83..e9f1d6e62146 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -143,6 +143,11 @@ EXPORT_SYMBOL_GPL(xstate_size);
*/
static void __init fpu__init_system_xstate_size_legacy(void)
{
+ static int on_boot_cpu = 1;
+
+ WARN_ON_FPU(!on_boot_cpu);
+ on_boot_cpu = 0;
+
/*
 * Note that xstate_size might be overwritten later during
* fpu__init_system_xstate().
@@ -214,7 +219,12 @@ __setup("eagerfpu=", eager_fpu_setup);
*/
static void __init fpu__init_system_ctx_switch(void)
{
- WARN_ON(current->thread.fpu.fpstate_active);
+ static bool on_boot_cpu = 1;
+
+ WARN_ON_FPU(!on_boot_cpu);
+ on_boot_cpu = 0;
+
+ WARN_ON_FPU(current->thread.fpu.fpstate_active);
current_thread_info()->status = 0;

/* Auto enable eagerfpu for xsaveopt */
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 201f08feb259..5724098adf1b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -262,6 +262,11 @@ static void __init setup_xstate_comp(void)
*/
static void __init setup_init_fpu_buf(void)
{
+ static int on_boot_cpu = 1;
+
+ WARN_ON_FPU(!on_boot_cpu);
+ on_boot_cpu = 0;
+
if (!cpu_has_xsave)
return;

@@ -317,6 +322,10 @@ static void __init init_xstate_size(void)
void __init fpu__init_system_xstate(void)
{
unsigned int eax, ebx, ecx, edx;
+ static int on_boot_cpu = 1;
+
+ WARN_ON_FPU(!on_boot_cpu);
+ on_boot_cpu = 0;

if (!cpu_has_xsave) {
pr_info("x86/fpu: Legacy x87 FPU detected.\n");
@@ -324,7 +333,7 @@ void __init fpu__init_system_xstate(void)
}

if (boot_cpu_data.cpuid_level < XSTATE_CPUID) {
- WARN(1, "x86/fpu: XSTATE_CPUID missing!\n");
+ WARN_ON_FPU(1);
return;
}

--
2.1.0

2015-05-05 18:01:11

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem

Add a short FPU performance measurement suite that runs once during bootup.

It can be enabled via CONFIG_X86_DEBUG_FPU_PERFORMANCE=y.

x86/fpu:##################################################################
x86/fpu: Running FPU performance measurement suite (cache hot):
x86/fpu: Cost of: null : 108 cycles
x86/fpu:######## CPU instructions: ############################
x86/fpu: Cost of: NOP insn : 0 cycles
x86/fpu: Cost of: RDTSC insn : 12 cycles
x86/fpu: Cost of: RDMSR insn : 100 cycles
x86/fpu: Cost of: WRMSR insn : 396 cycles
x86/fpu: Cost of: CLI insn same-IF : 0 cycles
x86/fpu: Cost of: CLI insn flip-IF : 0 cycles
x86/fpu: Cost of: STI insn same-IF : 0 cycles
x86/fpu: Cost of: STI insn flip-IF : 0 cycles
x86/fpu: Cost of: PUSHF insn : 0 cycles
x86/fpu: Cost of: POPF insn same-IF : 20 cycles
x86/fpu: Cost of: POPF insn flip-IF : 28 cycles
x86/fpu:######## IRQ save/restore APIs: ############################
x86/fpu: Cost of: local_irq_save() fn : 20 cycles
x86/fpu: Cost of: local_irq_restore() fn same-IF : 24 cycles
x86/fpu: Cost of: local_irq_restore() fn flip-IF : 28 cycles
x86/fpu: Cost of: irq_save()+restore() fn same-IF : 48 cycles
x86/fpu: Cost of: irq_save()+restore() fn flip-IF : 48 cycles
x86/fpu:######## locking APIs: ############################
x86/fpu: Cost of: smp_mb() fn : 40 cycles
x86/fpu: Cost of: cpu_relax() fn : 8 cycles
x86/fpu: Cost of: spin_lock()+unlock() fn : 64 cycles
x86/fpu: Cost of: read_lock()+unlock() fn : 76 cycles
x86/fpu: Cost of: write_lock()+unlock() fn : 52 cycles
x86/fpu: Cost of: rcu_read_lock()+unlock() fn : 16 cycles
x86/fpu: Cost of: preempt_disable()+enable() fn : 20 cycles
x86/fpu: Cost of: mutex_lock()+unlock() fn : 56 cycles
x86/fpu:######## MM instructions: ############################
x86/fpu: Cost of: __flush_tlb() fn : 132 cycles
x86/fpu: Cost of: __flush_tlb_global() fn : 920 cycles
x86/fpu: Cost of: __flush_tlb_one() fn : 288 cycles
x86/fpu: Cost of: __flush_tlb_range() fn : 412 cycles
x86/fpu:######## FPU instructions: ############################
x86/fpu: Cost of: CR0 read : 4 cycles
x86/fpu: Cost of: CR0 write : 208 cycles
x86/fpu: Cost of: CR0::TS fault : 1156 cycles
x86/fpu: Cost of: FNINIT insn : 76 cycles
x86/fpu: Cost of: FWAIT insn : 0 cycles
x86/fpu: Cost of: FSAVE insn : 168 cycles
x86/fpu: Cost of: FRSTOR insn : 160 cycles
x86/fpu: Cost of: FXSAVE insn : 84 cycles
x86/fpu: Cost of: FXRSTOR insn : 44 cycles
x86/fpu: Cost of: FXRSTOR fault : 688 cycles
x86/fpu: Cost of: XSAVE insn : 104 cycles
x86/fpu: Cost of: XRSTOR insn : 80 cycles
x86/fpu: Cost of: XRSTOR fault : 884 cycles
x86/fpu:##################################################################

on an AMD system:

x86/fpu:##################################################################
x86/fpu: Running FPU performance measurement suite (cache hot):
x86/fpu: Cost of: null : 144 cycles
x86/fpu:######## CPU instructions: ############################
x86/fpu: Cost of: NOP insn : 4 cycles
x86/fpu: Cost of: RDTSC insn : 71 cycles
x86/fpu: Cost of: RDMSR insn : 43 cycles
x86/fpu: Cost of: WRMSR insn : 148 cycles
x86/fpu: Cost of: CLI insn same-IF : 8 cycles
x86/fpu: Cost of: CLI insn flip-IF : 5 cycles
x86/fpu: Cost of: STI insn same-IF : 28 cycles
x86/fpu: Cost of: STI insn flip-IF : 0 cycles
x86/fpu: Cost of: PUSHF insn : 15 cycles
x86/fpu: Cost of: POPF insn same-IF : 8 cycles
x86/fpu: Cost of: POPF insn flip-IF : 12 cycles
x86/fpu:######## IRQ save/restore APIs: ############################
x86/fpu: Cost of: local_irq_save() fn : 0 cycles
x86/fpu: Cost of: local_irq_restore() fn same-IF : 7 cycles
x86/fpu: Cost of: local_irq_restore() fn flip-IF : 20 cycles
x86/fpu: Cost of: irq_save()+restore() fn same-IF : 20 cycles
x86/fpu: Cost of: irq_save()+restore() fn flip-IF : 20 cycles
x86/fpu:######## locking APIs: ############################
x86/fpu: Cost of: smp_mb() fn : 38 cycles
x86/fpu: Cost of: cpu_relax() fn : 7 cycles
x86/fpu: Cost of: spin_lock()+unlock() fn : 89 cycles
x86/fpu: Cost of: read_lock()+unlock() fn : 91 cycles
x86/fpu: Cost of: write_lock()+unlock() fn : 85 cycles
x86/fpu: Cost of: rcu_read_lock()+unlock() fn : 30 cycles
x86/fpu: Cost of: preempt_disable()+enable() fn : 38 cycles
x86/fpu: Cost of: mutex_lock()+unlock() fn : 64 cycles
x86/fpu:######## MM instructions: ############################
x86/fpu: Cost of: __flush_tlb() fn : 134 cycles
x86/fpu: Cost of: __flush_tlb_global() fn : 547 cycles
x86/fpu: Cost of: __flush_tlb_one() fn : 128 cycles
x86/fpu: Cost of: __flush_tlb_range() fn : 539 cycles
x86/fpu:######## FPU instructions: ############################
x86/fpu: Cost of: CR0 read : 16 cycles
x86/fpu: Cost of: CR0 write : 83 cycles
x86/fpu: Cost of: CR0::TS fault : 691 cycles
x86/fpu: Cost of: FNINIT insn : 118 cycles
x86/fpu: Cost of: FWAIT insn : 4 cycles
x86/fpu: Cost of: FSAVE insn : 156 cycles
x86/fpu: Cost of: FRSTOR insn : 151 cycles
x86/fpu: Cost of: FXSAVE insn : 73 cycles
x86/fpu: Cost of: FXRSTOR insn : 86 cycles
x86/fpu: Cost of: FXRSTOR fault : 441 cycles
x86/fpu:##################################################################

Note that there can be some jitter in the results between bootups.
The measurement takes the shortest of all runs, which is relatively
stable, but not completely so. Near-zero values are also skewed by
the subtraction of the null-loop overhead: for example NOPs
obviously don't really execute in 0 cycles, as the Intel results
above would suggest. Results are expected to be relatively accurate
for more complex instructions.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/Kconfig.debug | 15 ++
arch/x86/include/asm/fpu/measure.h | 13 ++
arch/x86/kernel/cpu/bugs.c | 2 +
arch/x86/kernel/cpu/bugs_64.c | 2 +
arch/x86/kernel/fpu/Makefile | 8 +-
arch/x86/kernel/fpu/measure.c | 509 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 548 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 2fd3ebbb4e33..8329635101f8 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -344,4 +344,19 @@ config X86_DEBUG_FPU

If unsure, say N.

+config X86_DEBUG_FPU_PERFORMANCE
+ bool "Measure x86 FPU performance"
+ depends on DEBUG_KERNEL
+ ---help---
+ If this option is enabled then the kernel will run a short
+ FPU (Floating Point Unit) benchmarking suite during bootup,
+ to measure the cost of various FPU hardware operations and
+ other kernel APIs.
+
+ The results are printed to the kernel log.
+
+ This extra benchmarking code will be freed after bootup.
+
+ If unsure, say N.
+
endmenu
diff --git a/arch/x86/include/asm/fpu/measure.h b/arch/x86/include/asm/fpu/measure.h
new file mode 100644
index 000000000000..d003809491c2
--- /dev/null
+++ b/arch/x86/include/asm/fpu/measure.h
@@ -0,0 +1,13 @@
+/*
+ * x86 FPU performance measurement methods:
+ */
+#ifndef _ASM_X86_FPU_MEASURE_H
+#define _ASM_X86_FPU_MEASURE_H
+
+#ifdef CONFIG_X86_DEBUG_FPU_PERFORMANCE
+extern void fpu__measure(void);
+#else
+static inline void fpu__measure(void) { }
+#endif
+
+#endif /* _ASM_X86_FPU_MEASURE_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bd17db15a2c1..1b947415d903 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -13,6 +13,7 @@
#include <asm/processor.h>
#include <asm/processor-flags.h>
#include <asm/fpu/internal.h>
+#include <asm/fpu/measure.h>
#include <asm/msr.h>
#include <asm/paravirt.h>
#include <asm/alternative.h>
@@ -37,6 +38,7 @@ void __init check_bugs(void)

init_utsname()->machine[1] =
'0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86);
+ fpu__measure();
alternative_instructions();

fpu__init_check_bugs();
diff --git a/arch/x86/kernel/cpu/bugs_64.c b/arch/x86/kernel/cpu/bugs_64.c
index 04f0fe5af83e..846c24aa14cf 100644
--- a/arch/x86/kernel/cpu/bugs_64.c
+++ b/arch/x86/kernel/cpu/bugs_64.c
@@ -8,6 +8,7 @@
#include <asm/alternative.h>
#include <asm/bugs.h>
#include <asm/processor.h>
+#include <asm/fpu/measure.h>
#include <asm/mtrr.h>
#include <asm/cacheflush.h>

@@ -18,6 +19,7 @@ void __init check_bugs(void)
printk(KERN_INFO "CPU: ");
print_cpu_info(&boot_cpu_data);
#endif
+ fpu__measure();
alternative_instructions();

/*
diff --git a/arch/x86/kernel/fpu/Makefile b/arch/x86/kernel/fpu/Makefile
index 68279efb811a..e7676c20bdde 100644
--- a/arch/x86/kernel/fpu/Makefile
+++ b/arch/x86/kernel/fpu/Makefile
@@ -2,4 +2,10 @@
# Build rules for the FPU support code:
#

-obj-y += init.o bugs.o core.o regset.o signal.o xstate.o
+obj-y += init.o bugs.o core.o regset.o signal.o xstate.o
+
+# Make the measured functions as simple as possible:
+CFLAGS_measure.o += -fomit-frame-pointer
+CFLAGS_REMOVE_measure.o = -pg
+
+obj-$(CONFIG_X86_DEBUG_FPU_PERFORMANCE) += measure.o
diff --git a/arch/x86/kernel/fpu/measure.c b/arch/x86/kernel/fpu/measure.c
new file mode 100644
index 000000000000..6232cdf240d8
--- /dev/null
+++ b/arch/x86/kernel/fpu/measure.c
@@ -0,0 +1,509 @@
+/*
+ * FPU performance measurement routines
+ */
+#include <asm/fpu/internal.h>
+#include <asm/tlbflush.h>
+
+#include <linux/kernel.h>
+
+/*
+ * Number of repeated measurements we do. We pick the fastest one:
+ */
+static int loops = 1000;
+
+/*
+ * Various small functions, whose overhead we measure:
+ */
+
+typedef void (*bench_fn_t)(void) __aligned(32);
+
+static void fn_empty(void)
+{
+}
+
+/* Basic instructions: */
+
+static void fn_nop(void)
+{
+ asm volatile ("nop");
+}
+
+static void fn_rdtsc(void)
+{
+ u32 low, high;
+
+ asm volatile ("rdtsc": "=a"(low), "=d"(high));
+}
+
+static void fn_rdmsr(void)
+{
+ u64 efer;
+
+ rdmsrl_safe(MSR_EFER, &efer);
+}
+
+static void fn_wrmsr(void)
+{
+ u64 efer;
+
+ if (!rdmsrl_safe(MSR_EFER, &efer))
+ wrmsrl_safe(MSR_EFER, efer);
+}
+
+static void fn_cli_same(void)
+{
+ asm volatile ("cli");
+}
+
+static void fn_cli_flip(void)
+{
+ asm volatile ("sti");
+ asm volatile ("cli");
+}
+
+static void fn_sti_same(void)
+{
+ asm volatile ("sti");
+}
+
+static void fn_sti_flip(void)
+{
+ asm volatile ("cli");
+ asm volatile ("sti");
+}
+
+static void fn_pushf(void)
+{
+ arch_local_save_flags();
+}
+
+static void fn_popf_baseline(void)
+{
+ arch_local_save_flags();
+ asm volatile ("cli");
+}
+
+static void fn_popf_flip(void)
+{
+ unsigned long flags = arch_local_save_flags();
+ asm volatile ("cli");
+
+ arch_local_irq_restore(flags);
+}
+
+static void fn_popf_same(void)
+{
+ unsigned long flags = arch_local_save_flags();
+
+ arch_local_irq_restore(flags);
+}
+
+/* Basic IRQ save/restore APIs: */
+
+static void fn_irq_save_baseline(void)
+{
+ local_irq_enable();
+}
+
+static void fn_irq_save(void)
+{
+ unsigned long flags;
+
+ local_irq_enable();
+ local_irq_save(flags);
+}
+
+static void fn_irq_restore_flip(void)
+{
+ unsigned long flags;
+
+ local_irq_enable();
+ local_irq_save(flags);
+ local_irq_restore(flags);
+}
+
+static void fn_irq_restore_same(void)
+{
+ unsigned long flags;
+
+ local_irq_disable();
+ local_irq_save(flags);
+ local_irq_restore(flags);
+}
+
+static void fn_irq_save_restore_flip(void)
+{
+ unsigned long flags;
+
+ local_irq_enable();
+
+ local_irq_save(flags);
+ local_irq_restore(flags);
+}
+
+static void fn_irq_save_restore_same(void)
+{
+ unsigned long flags;
+
+ local_irq_disable();
+
+ local_irq_save(flags);
+ local_irq_restore(flags);
+}
+
+/* Basic locking primitives: */
+
+static void fn_smp_mb(void)
+{
+ smp_mb();
+}
+
+static void fn_cpu_relax(void)
+{
+ cpu_relax();
+}
+
+static DEFINE_SPINLOCK(test_spinlock);
+
+static void fn_spin_lock_unlock(void)
+{
+ spin_lock(&test_spinlock);
+ spin_unlock(&test_spinlock);
+}
+
+static DEFINE_RWLOCK(test_rwlock);
+
+static void fn_read_lock_unlock(void)
+{
+ read_lock(&test_rwlock);
+ read_unlock(&test_rwlock);
+}
+
+static void fn_write_lock_unlock(void)
+{
+ write_lock(&test_rwlock);
+ write_unlock(&test_rwlock);
+}
+
+static void fn_rcu_read_lock_unlock(void)
+{
+ rcu_read_lock();
+ rcu_read_unlock();
+}
+
+static void fn_preempt_disable_enable(void)
+{
+ preempt_disable();
+ preempt_enable();
+}
+
+static DEFINE_MUTEX(test_mutex);
+
+static void fn_mutex_lock_unlock(void)
+{
+ local_irq_enable();
+
+ mutex_lock(&test_mutex);
+ mutex_unlock(&test_mutex);
+}
+
+/* MM instructions: */
+
+static void fn_flush_tlb(void)
+{
+ __flush_tlb();
+}
+
+static void fn_flush_tlb_global(void)
+{
+ __flush_tlb_global();
+}
+
+static char tlb_flush_target[PAGE_SIZE] __aligned(4096);
+
+static void fn_flush_tlb_one(void)
+{
+ unsigned long addr = (unsigned long)&tlb_flush_target;
+
+ tlb_flush_target[0]++;
+ __flush_tlb_one(addr);
+}
+
+static void fn_flush_tlb_range(void)
+{
+ unsigned long start = (unsigned long)&tlb_flush_target;
+ unsigned long end = start+PAGE_SIZE;
+ struct mm_struct *mm_saved;
+
+ tlb_flush_target[0]++;
+
+ mm_saved = current->mm;
+ current->mm = current->active_mm;
+
+ flush_tlb_mm_range(current->active_mm, start, end, 0);
+
+ current->mm = mm_saved;
+}
+
+/* FPU instructions: */
+
+static void fn_read_cr0(void)
+{
+ read_cr0();
+}
+
+static void fn_rw_cr0(void)
+{
+ write_cr0(read_cr0());
+}
+
+static void fn_cr0_fault(void)
+{
+ struct fpu *fpu = &current->thread.fpu;
+ u32 cr0 = read_cr0();
+
+ write_cr0(cr0 | X86_CR0_TS);
+
+ asm volatile("fwait");
+
+ /* Zap the FP state we created via the fault: */
+ fpu->fpregs_active = 0;
+ fpu->fpstate_active = 0;
+
+ write_cr0(cr0);
+}
+
+static void fn_fninit(void)
+{
+ asm volatile ("fninit");
+}
+
+static void fn_fwait(void)
+{
+ asm volatile("fwait");
+}
+
+static void fn_fsave(void)
+{
+ static struct fregs_state fstate __aligned(32);
+
+ copy_fregs_to_user(&fstate);
+}
+
+static void fn_frstor(void)
+{
+ static struct fregs_state fstate __aligned(32);
+
+ copy_fregs_to_user(&fstate);
+ copy_user_to_fregs(&fstate);
+}
+
+static void fn_fxsave(void)
+{
+ struct fxregs_state fxstate __aligned(32);
+
+ copy_fxregs_to_user(&fxstate);
+}
+
+static void fn_fxrstor(void)
+{
+ static struct fxregs_state fxstate __aligned(32);
+
+ copy_fxregs_to_user(&fxstate);
+ copy_user_to_fxregs(&fxstate);
+}
+
+/*
+ * Provoke #GP on invalid FXRSTOR:
+ */
+static void fn_fxrstor_fault(void)
+{
+ static struct fxregs_state fxstate __aligned(32);
+ struct fpu *fpu = &current->thread.fpu;
+
+ copy_fxregs_to_user(&fxstate);
+
+ /* Set invalid MXCSR value, this will generate a #GP: */
+ fxstate.mxcsr = -1;
+
+ copy_user_to_fxregs(&fxstate);
+
+ /* Zap any FP state we created via the fault: */
+ fpu->fpregs_active = 0;
+ fpu->fpstate_active = 0;
+}
+
+static void fn_xsave(void)
+{
+ static struct xregs_state x __aligned(32);
+
+ copy_xregs_to_kernel_booting(&x);
+}
+
+static void fn_xrstor(void)
+{
+ static struct xregs_state x __aligned(32);
+
+ copy_xregs_to_kernel_booting(&x);
+ copy_kernel_to_xregs_booting(&x, -1);
+}
+
+/*
+ * Provoke #GP on invalid XRSTOR:
+ */
+static void fn_xrstor_fault(void)
+{
+ static struct xregs_state x __aligned(32);
+
+ copy_xregs_to_kernel_booting(&x);
+
+ /* Set invalid MXCSR value, this will generate a #GP: */
+ x.i387.mxcsr = -1;
+
+ copy_kernel_to_xregs_booting(&x, -1);
+}
+
+static s64
+measure(s64 null_overhead, bench_fn_t bench_fn,
+ const char *txt_1, const char *txt_2, const char *txt_3)
+{
+ unsigned long flags;
+ u32 cr0_saved;
+ int eager_saved;
+ u64 t0, t1;
+ s64 delta, delta_min;
+ int i;
+
+ delta_min = LONG_MAX;
+
+ /* Disable eagerfpu, so that we can provoke CR0::TS faults: */
+ eager_saved = boot_cpu_has(X86_FEATURE_EAGER_FPU);
+ setup_clear_cpu_cap(X86_FEATURE_EAGER_FPU);
+
+ /* Save CR0 so that we can freely set it to any value during measurement: */
+ cr0_saved = read_cr0();
+ /* Clear TS, so that we can measure FPU ops by default: */
+ write_cr0(cr0_saved & ~X86_CR0_TS);
+
+ local_irq_save(flags);
+
+ asm volatile (".align 32\n");
+
+ for (i = 0; i < loops; i++) {
+ rdtscll(t0);
+ mb();
+
+ bench_fn();
+
+ mb();
+ rdtscll(t1);
+ delta = t1-t0;
+ if (delta <= 0)
+ continue;
+
+ delta_min = min(delta_min, delta);
+ }
+
+ local_irq_restore(flags);
+ write_cr0(cr0_saved);
+
+ if (eager_saved)
+ setup_force_cpu_cap(X86_FEATURE_EAGER_FPU);
+
+ delta_min = max(0LL, delta_min-null_overhead);
+
+ if (txt_1) {
+ if (!txt_2)
+ txt_2 = "";
+ if (!txt_3)
+ txt_3 = "";
+ pr_info("x86/fpu: Cost of: %-27s %-5s %-8s: %5Ld cycles\n", txt_1, txt_2, txt_3, delta_min);
+ }
+
+ return delta_min;
+}
+
+/*
+ * Measure all the above primitives:
+ */
+void __init fpu__measure(void)
+{
+ s64 cost;
+ s64 rdmsr_cost;
+ s64 cli_cost, sti_cost, popf_cost, irq_save_cost;
+ s64 cr0_read_cost, cr0_write_cost;
+ s64 save_cost;
+
+ pr_info("x86/fpu:##################################################################\n");
+ pr_info("x86/fpu: Running FPU performance measurement suite (cache hot):\n");
+
+ cost = measure(0, fn_empty, "null", NULL, NULL);
+
+ pr_info("x86/fpu:######## CPU instructions: ############################\n");
+ measure(cost, fn_nop, "NOP", "insn", NULL);
+ measure(cost, fn_rdtsc, "RDTSC", "insn", NULL);
+
+ rdmsr_cost = measure(cost, fn_rdmsr, "RDMSR", "insn", NULL);
+ measure(cost+rdmsr_cost, fn_wrmsr,"WRMSR", "insn", NULL);
+
+ cli_cost = measure(cost, fn_cli_same, "CLI", "insn", "same-IF");
+ measure(cost+cli_cost, fn_cli_flip, "CLI", "insn", "flip-IF");
+
+ sti_cost = measure(cost, fn_sti_same, "STI", "insn", "same-IF");
+ measure(cost+sti_cost, fn_sti_flip, "STI", "insn", "flip-IF");
+
+ measure(cost, fn_pushf, "PUSHF", "insn", NULL);
+
+ popf_cost = measure(cost, fn_popf_baseline, NULL, NULL, NULL);
+ measure(cost+popf_cost, fn_popf_same, "POPF", "insn", "same-IF");
+ measure(cost+popf_cost, fn_popf_flip, "POPF", "insn", "flip-IF");
+
+ pr_info("x86/fpu:######## IRQ save/restore APIs: ############################\n");
+ irq_save_cost = measure(cost, fn_irq_save_baseline, NULL, NULL, NULL);
+ irq_save_cost += measure(cost+irq_save_cost, fn_irq_save, "local_irq_save()", "fn", NULL);
+ measure(cost+irq_save_cost, fn_irq_restore_same, "local_irq_restore()", "fn", "same-IF");
+ measure(cost+irq_save_cost, fn_irq_restore_flip, "local_irq_restore()", "fn", "flip-IF");
+ measure(cost+sti_cost, fn_irq_save_restore_same, "irq_save()+restore()", "fn", "same-IF");
+ measure(cost+sti_cost, fn_irq_save_restore_flip, "irq_save()+restore()", "fn", "flip-IF");
+
+ pr_info("x86/fpu:######## locking APIs: ############################\n");
+ measure(cost, fn_smp_mb, "smp_mb()", "fn", NULL);
+ measure(cost, fn_cpu_relax, "cpu_relax()", "fn", NULL);
+ measure(cost, fn_spin_lock_unlock, "spin_lock()+unlock()", "fn", NULL);
+ measure(cost, fn_read_lock_unlock, "read_lock()+unlock()", "fn", NULL);
+ measure(cost, fn_write_lock_unlock, "write_lock()+unlock()", "fn", NULL);
+ measure(cost, fn_rcu_read_lock_unlock, "rcu_read_lock()+unlock()", "fn", NULL);
+ measure(cost, fn_preempt_disable_enable, "preempt_disable()+enable()", "fn", NULL);
+ measure(cost+sti_cost, fn_mutex_lock_unlock, "mutex_lock()+unlock()", "fn", NULL);
+
+ pr_info("x86/fpu:######## MM instructions: ############################\n");
+ measure(cost, fn_flush_tlb, "__flush_tlb()", "fn", NULL);
+ measure(cost, fn_flush_tlb_global, "__flush_tlb_global()", "fn", NULL);
+ measure(cost, fn_flush_tlb_one, "__flush_tlb_one()", "fn", NULL);
+ measure(cost, fn_flush_tlb_range, "__flush_tlb_range()", "fn", NULL);
+
+ pr_info("x86/fpu:######## FPU instructions: ############################\n");
+ cr0_read_cost = measure(cost, fn_read_cr0, "CR0", "read", NULL);
+ cr0_write_cost = measure(cost+cr0_read_cost, fn_rw_cr0, "CR0", "write", NULL);
+
+ measure(cost+cr0_read_cost+cr0_write_cost, fn_cr0_fault, "CR0::TS", "fault", NULL);
+
+ measure(cost, fn_fninit, "FNINIT", "insn", NULL);
+ measure(cost, fn_fwait, "FWAIT", "insn", NULL);
+
+ save_cost = measure(cost, fn_fsave, "FSAVE", "insn", NULL);
+ measure(cost+save_cost, fn_frstor, "FRSTOR", "insn", NULL);
+
+ if (cpu_has_fxsr) {
+ save_cost = measure(cost, fn_fxsave, "FXSAVE", "insn", NULL);
+ measure(cost+save_cost, fn_fxrstor, "FXRSTOR", "insn", NULL);
+ measure(cost+save_cost, fn_fxrstor_fault,"FXRSTOR", "fault", NULL);
+ }
+ if (cpu_has_xsaveopt) {
+ save_cost = measure(cost, fn_xsave, "XSAVE", "insn", NULL);
+ measure(cost+save_cost, fn_xrstor, "XRSTOR", "insn", NULL);
+ measure(cost+save_cost, fn_xrstor_fault, "XRSTOR", "fault", NULL);
+ }
+ pr_info("x86/fpu:##################################################################\n");
+}
--
2.1.0

2015-05-05 18:00:45

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 208/208] x86/fpu: Reorganize fpu/internal.h

fpu/internal.h has grown organically, with not much high level structure,
which hurts its readability.

Organize the various definitions into 5 sections:

- high level FPU state functions
- FPU/CPU feature flag helpers
- fpstate handling functions
- FPU context switching helpers
- misc helper functions

Other related changes:

- Move MXCSR_DEFAULT to fpu/types.h.
- Drop the unused X87_FSW_ES define.

No change in functionality.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 159 ++++++++++++++++++++++++++++++--------------------------
arch/x86/include/asm/fpu/types.h | 3 ++
2 files changed, 87 insertions(+), 75 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index d2a281bd5f45..a98a08d1efa9 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -18,32 +18,6 @@
#include <asm/fpu/api.h>
#include <asm/fpu/xstate.h>

-#define MXCSR_DEFAULT 0x1f80
-
-extern unsigned int mxcsr_feature_mask;
-
-extern union fpregs_state init_fpstate;
-
-extern void fpu__init_cpu(void);
-extern void fpu__init_system_xstate(void);
-extern void fpu__init_cpu_xstate(void);
-extern void fpu__init_system(struct cpuinfo_x86 *c);
-
-extern void fpstate_init(union fpregs_state *state);
-#ifdef CONFIG_MATH_EMULATION
-extern void fpstate_init_soft(struct swregs_state *soft);
-#else
-static inline void fpstate_init_soft(struct swregs_state *soft) {}
-#endif
-static inline void fpstate_init_fxstate(struct fxregs_state *fx)
-{
- fx->cwd = 0x37f;
- fx->mxcsr = MXCSR_DEFAULT;
-}
-
-extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
-extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
-
/*
* High level FPU state handling functions:
*/
@@ -55,7 +29,16 @@ extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
extern void fpu__clear(struct fpu *fpu);
+extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
+extern int dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate);

+/*
+ * Boot time FPU initialization functions:
+ */
+extern void fpu__init_cpu(void);
+extern void fpu__init_system_xstate(void);
+extern void fpu__init_cpu_xstate(void);
+extern void fpu__init_system(struct cpuinfo_x86 *c);
extern void fpu__init_check_bugs(void);
extern void fpu__resume_cpu(void);

@@ -68,27 +51,9 @@ extern void fpu__resume_cpu(void);
# define WARN_ON_FPU(x) ({ (void)(x); 0; })
#endif

-DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
-
/*
- * Must be run with preemption disabled: this clears the fpu_fpregs_owner_ctx,
- * on this CPU.
- *
- * This will disable any lazy FPU state restore of the current FPU state,
- * but if the current thread owns the FPU, it will still be saved by.
+ * FPU related CPU feature flag helper routines:
*/
-static inline void __cpu_disable_lazy_restore(unsigned int cpu)
-{
- per_cpu(fpu_fpregs_owner_ctx, cpu) = NULL;
-}
-
-static inline int fpu_want_lazy_restore(struct fpu *fpu, unsigned int cpu)
-{
- return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
-}
-
-#define X87_FSW_ES (1 << 7) /* Exception Summary */
-
static __always_inline __pure bool use_eager_fpu(void)
{
return static_cpu_has_safe(X86_FEATURE_EAGER_FPU);
@@ -109,6 +74,23 @@ static __always_inline __pure bool use_fxsr(void)
return static_cpu_has_safe(X86_FEATURE_FXSR);
}

+/*
+ * fpstate handling functions:
+ */
+
+extern union fpregs_state init_fpstate;
+
+extern void fpstate_init(union fpregs_state *state);
+#ifdef CONFIG_MATH_EMULATION
+extern void fpstate_init_soft(struct swregs_state *soft);
+#else
+static inline void fpstate_init_soft(struct swregs_state *soft) {}
+#endif
+static inline void fpstate_init_fxstate(struct fxregs_state *fx)
+{
+ fx->cwd = 0x37f;
+ fx->mxcsr = MXCSR_DEFAULT;
+}
extern void fpstate_sanitize_xstate(struct fpu *fpu);

#define user_insn(insn, output, input...) \
@@ -285,6 +267,32 @@ static inline int copy_fpstate_to_fpregs(struct fpu *fpu)
return __copy_fpstate_to_fpregs(fpu);
}

+extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fx, int size);
+
+/*
+ * FPU context switch related helper methods:
+ */
+
+DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
+
+/*
+ * Must be run with preemption disabled: this clears the fpu_fpregs_owner_ctx,
+ * on this CPU.
+ *
+ * This will disable any lazy FPU state restore of the current FPU state,
+ * but if the current thread owns the FPU, its registers will still be
+ * saved back to its fpstate on the next context switch.
+ */
+static inline void __cpu_disable_lazy_restore(unsigned int cpu)
+{
+ per_cpu(fpu_fpregs_owner_ctx, cpu) = NULL;
+}
+
+static inline int fpu_want_lazy_restore(struct fpu *fpu, unsigned int cpu)
+{
+ return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
+}
+
+
/*
* Wrap lazy FPU TS handling in a 'hw fpregs activation/deactivation'
* idiom, which is then paired with the sw-flag (fpregs_active) later on:
@@ -355,31 +363,6 @@ static inline void fpregs_deactivate(struct fpu *fpu)
}

/*
- * Definitions for the eXtended Control Register instructions
- */
-
-#define XCR_XFEATURE_ENABLED_MASK 0x00000000
-
-static inline u64 xgetbv(u32 index)
-{
- u32 eax, edx;
-
- asm volatile(".byte 0x0f,0x01,0xd0" /* xgetbv */
- : "=a" (eax), "=d" (edx)
- : "c" (index));
- return eax + ((u64)edx << 32);
-}
-
-static inline void xsetbv(u32 index, u64 value)
-{
- u32 eax = value;
- u32 edx = value >> 32;
-
- asm volatile(".byte 0x0f,0x01,0xd1" /* xsetbv */
- : : "a" (eax), "d" (edx), "c" (index));
-}
-
-/*
* FPU state switching for scheduling.
*
* This is a two-stage process:
@@ -438,6 +421,10 @@ switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
}

/*
+ * Misc helper functions:
+ */
+
+/*
* By the time this gets called, we've already cleared CR0.TS and
* given the process the FPU if we are going to preload the FPU
* state - all we need to do is to conditionally restore the register
@@ -454,11 +441,6 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, fpu_switch_t fpu_switc
}

/*
- * Signal frame handlers...
- */
-extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fx, int size);
-
-/*
* Needs to be preemption-safe.
*
* NOTE! user_fpu_begin() must be used only immediately before restoring
@@ -476,4 +458,31 @@ static inline void user_fpu_begin(void)
preempt_enable();
}

+/*
+ * MXCSR and XCR definitions:
+ */
+
+extern unsigned int mxcsr_feature_mask;
+
+#define XCR_XFEATURE_ENABLED_MASK 0x00000000
+
+static inline u64 xgetbv(u32 index)
+{
+ u32 eax, edx;
+
+ asm volatile(".byte 0x0f,0x01,0xd0" /* xgetbv */
+ : "=a" (eax), "=d" (edx)
+ : "c" (index));
+ return eax + ((u64)edx << 32);
+}
+
+static inline void xsetbv(u32 index, u64 value)
+{
+ u32 eax = value;
+ u32 edx = value >> 32;
+
+ asm volatile(".byte 0x0f,0x01,0xd1" /* xsetbv */
+ : : "a" (eax), "d" (edx), "c" (index));
+}
+
#endif /* _ASM_X86_FPU_INTERNAL_H */
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 02241c2a10e9..0637826292de 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -65,6 +65,9 @@ struct fxregs_state {

} __attribute__((aligned(16)));

+/* Default value for fxregs_state.mxcsr: */
+#define MXCSR_DEFAULT 0x1f80
+
/*
* Software based FPU emulation state. This is arbitrary really,
* it matches the x87 format to make it easier to understand:
--
2.1.0

2015-05-05 19:15:04

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem

On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> x86/fpu: Cost of: XSAVE insn : 104 cycles
> x86/fpu: Cost of: XRSTOR insn : 80 cycles

Isn't there going to be pretty huge variability here depending on how
much state you are xsave/xrstor'ing and if the init/modified
optimizations are in play?

2015-05-05 19:22:34

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem

On Tue, May 05, 2015 at 12:15:00PM -0700, Dave Hansen wrote:
> On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> > x86/fpu: Cost of: XSAVE insn : 104 cycles
> > x86/fpu: Cost of: XRSTOR insn : 80 cycles
>
> Isn't there going to be pretty huge variability here depending on how
> much state you are xsave/xrstor'ing and if the init/modified
> optimizations are in play?

If this is a module, one could modprobe/rmmod it multiple times for
an average. But yeah, it would depend in the end on what the system
does/has been doing in the recent past and thus how much state has been
"accumulated".

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-05 19:41:23

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 206/208] x86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging code

On Tue, May 05, 2015 at 07:58:30PM +0200, Ingo Molnar wrote:
> There are various internal FPU state debugging checks that never
> trigger in practice, but which are useful for FPU code development.
>
> Separate these out into CONFIG_X86_DEBUG_FPU=y, and also add a
> couple of new ones.
>
> The size difference is about 0.5K of code on defconfig:
>
> text data bss filename
> 15028906 2578816 1638400 vmlinux
> 15029430 2578816 1638400 vmlinux
>
> ( Keep this enabled by default until the new FPU code is debugged. )
>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Fenghua Yu <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Oleg Nesterov <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>

...

> diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
> index a4c1b7dbf70e..d2a281bd5f45 100644
> --- a/arch/x86/include/asm/fpu/internal.h
> +++ b/arch/x86/include/asm/fpu/internal.h
> @@ -59,6 +59,15 @@ extern void fpu__clear(struct fpu *fpu);
> extern void fpu__init_check_bugs(void);
> extern void fpu__resume_cpu(void);
>
> +/*
> + * Debugging facility:
> + */
> +#ifdef CONFIG_X86_DEBUG_FPU
> +# define WARN_ON_FPU(x) WARN_ON_ONCE(x)
> +#else
> +# define WARN_ON_FPU(x) ({ 0; })

Shouldn't this be called FPU_WARN_ON() ?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-05 19:52:29

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 198/208] x86/fpu: Document the various fpregs state formats

On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> +/*
> + * This is our most modern FPU state format, as saved by the XSAVE
> + * and restored by the XRSTOR instructions.
> + *
> + * It consists of a legacy fxregs portion, an xstate header and
> + * subsequent fixed size areas as defined by the xstate header.
> + * Not all CPUs support all the extensions.
> + */
> struct xregs_state {
> struct fxregs_state i387;
> struct xstate_header header;
> @@ -150,6 +169,13 @@ struct xregs_state {
> /* New processor state extensions will go here. */
> } __attribute__ ((packed, aligned (64)));

Fenghua has a "fix" for this, but I think this misses a pretty big point.

This structure includes only the "legacy" state, followed by the header.
The remainder of the layout here is enumerated in CPUID leaves and can
not be laid out in a structure because we do not know what it looks like
until we run CPUID.

There is logically a variable length array at the end of this sucker.

2015-05-05 20:11:25

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 200/208] x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end

On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> - do {
> + for (leaf = 2; leaf < xfeatures_nr; leaf++) {
> cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
>
> - if (eax == 0)
> - break;
> -
> xstate_offsets[leaf] = ebx;
> xstate_sizes[leaf] = eax;
>
> + printk(KERN_INFO "x86/fpu: xstate_offset[%d]: %04x, xstate_sizes[%d]: %04x\n", leaf, ebx, leaf, eax);
> leaf++;
> - } while (1);
> + }

We're going to have to revisit this. There's a new SDM out that makes
this incorrect. It says:

"If support for state component i is limited to XSAVES and XRSTORS, EBX
returns 0."

2015-05-05 20:12:56

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 201/208] x86/fpu: Clean up xstate feature reservation

On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> struct xregs_state {
> struct fxregs_state i387;
> struct xstate_header header;
> - struct ymmh_struct ymmh;
> - struct lwp_struct lwp;
> - struct bndreg bndreg[4];
> - struct bndcsr bndcsr;
> - /* New processor state extensions will go here. */
> + u8 __reserved[XSTATE_RESERVE];
> } __attribute__ ((packed, aligned (64)));

Just to reiterate. The size of 'XSTATE_RESERVE' is completely unknown
at compile time. It's wrong in the existing kernel, but we should fix
it up instead of mucking with it like this.

2015-05-05 22:48:08

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 176/208] x86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_state()

On Tue, May 05, 2015 at 07:58:00PM +0200, Ingo Molnar wrote:
> We'd like to use xsave_state() earlier, but its SYSTEM_BOOTING check
> is too imprecise.
>
> The real condition that xsave_state() would like to check is whether
> alternative XSAVE instructions were patched into the kernel image
> already.
>
> Add such a (read-mostly) debug flag and use it in xsave_state().
>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Fenghua Yu <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Oleg Nesterov <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> arch/x86/include/asm/alternative.h | 6 ++++++
> arch/x86/include/asm/fpu/xstate.h | 2 +-
> arch/x86/kernel/alternative.c | 5 +++++
> 3 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index ba32af062f61..7bfc85bbb8ff 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -52,6 +52,12 @@ struct alt_instr {
> u8 padlen; /* length of build-time padding */
> } __packed;
>
> +/*
> + * Debug flag that can be tested to see whether alternative
> + * instructions were patched in already:
> + */
> +extern int alternatives_patched;

Looks useful...

> +
> extern void alternative_instructions(void);
> extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
>
> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
> index 31a002ad5aeb..ab2c507b58b6 100644
> --- a/arch/x86/include/asm/fpu/xstate.h
> +++ b/arch/x86/include/asm/fpu/xstate.h
> @@ -119,7 +119,7 @@ static inline int xsave_state(struct xsave_struct *fx)
> u32 hmask = mask >> 32;
> int err = 0;
>
> - WARN_ON(system_state == SYSTEM_BOOTING);
> + WARN_ON(!alternatives_patched);

Btw, shouldn't we be doing this check in what is called now
copy_kernel_to_xregs() too, i.e., right under this function which is now
called copy_xregs_to_kernel()?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-05 22:56:21

by Fenghua Yu

[permalink] [raw]
Subject: RE: [PATCH 198/208] x86/fpu: Document the various fpregs state formats

> From: Dave Hansen [mailto:[email protected]]
> Sent: Tuesday, May 05, 2015 12:52 PM
> To: Ingo Molnar; [email protected]
> Cc: Andy Lutomirski; Borislav Petkov; Yu, Fenghua; H. Peter Anvin; Linus
> Torvalds; Oleg Nesterov; Thomas Gleixner
> Subject: Re: [PATCH 198/208] x86/fpu: Document the various fpregs state
> formats
>
> On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> > +/*
> > + * This is our most modern FPU state format, as saved by the XSAVE
> > + * and restored by the XRSTOR instructions.
> > + *
> > + * It consists of a legacy fxregs portion, an xstate header and
> > + * subsequent fixed size areas as defined by the xstate header.
> > + * Not all CPUs support all the extensions.
> > + */
> > struct xregs_state {
> > struct fxregs_state i387;
> > struct xstate_header header;
> > @@ -150,6 +169,13 @@ struct xregs_state {
> > /* New processor state extensions will go here. */ } __attribute__
> > ((packed, aligned (64)));
>
> Fenghua has a "fix" for this, but I think this misses a pretty big point.

Here is the "fix" patch Dave referred to: https://lkml.org/lkml/2015/4/22/9

>
> This structure includes only the "legacy" state, followed by the header.
> The remainder of the layout here is enumerated in CPUID leaves and can not
> be laid out in a structure because we do not know what it looks like until we
> run CPUID.
>
> There is logically a variable length array at the end of this sucker.

2015-05-05 23:04:50

by Fenghua Yu

[permalink] [raw]
Subject: RE: [PATCH 200/208] x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end

> From: Ingo Molnar [mailto:[email protected]] On Behalf Of Ingo
> Molnar
> Sent: Tuesday, May 05, 2015 10:58 AM
> To: [email protected]
> Cc: Andy Lutomirski; Borislav Petkov; Dave Hansen; Yu, Fenghua; H. Peter
> Anvin; Linus Torvalds; Oleg Nesterov; Thomas Gleixner
> Subject: [PATCH 200/208] x86/fpu/xstate: Don't assume the first zero
> xfeatures zero bit means the end
>
> The current xstate code in setup_xstate_features() assumes that the first
> zero bit means the end of xfeatures - but that is not so, the SDM clearly
> states that an arbitrary set of xfeatures might be enabled - and it is also clear
> from the description of the compaction feature that holes are possible:

A previous patch in lkml has (exactly) the same fix:
http://lists-archives.com/linux-kernel/28292115-x86-xsave-c-fix-xstate-offsets-and-sizes-enumeration.html

Thanks.

-Fenghua



2015-05-06 00:53:08

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem

On May 5, 2015 11:30 PM, "Ingo Molnar" <[email protected]> wrote:
>
> Add a short FPU performance suite that runs once during bootup.
>
> It can be enabled via CONFIG_X86_DEBUG_FPU_PERFORMANCE=y.

Neat!

Can you change "cycles" to "TSC ticks"? They're not quite the same thing.

--Andy

2015-05-06 02:57:51

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 176/208] x86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_state()


* Borislav Petkov <[email protected]> wrote:

> On Tue, May 05, 2015 at 07:58:00PM +0200, Ingo Molnar wrote:
> > We'd like to use xsave_state() earlier, but its SYSTEM_BOOTING check
> > is too imprecise.
> >
> > The real condition that xsave_state() would like to check is whether
> > alternative XSAVE instructions were patched into the kernel image
> > already.
> >
> > Add such a (read-mostly) debug flag and use it in xsave_state().
> >
> > Cc: Andy Lutomirski <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > Cc: Dave Hansen <[email protected]>
> > Cc: Fenghua Yu <[email protected]>
> > Cc: H. Peter Anvin <[email protected]>
> > Cc: Linus Torvalds <[email protected]>
> > Cc: Oleg Nesterov <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Signed-off-by: Ingo Molnar <[email protected]>
> > ---
> > arch/x86/include/asm/alternative.h | 6 ++++++
> > arch/x86/include/asm/fpu/xstate.h | 2 +-
> > arch/x86/kernel/alternative.c | 5 +++++
> > 3 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> > index ba32af062f61..7bfc85bbb8ff 100644
> > --- a/arch/x86/include/asm/alternative.h
> > +++ b/arch/x86/include/asm/alternative.h
> > @@ -52,6 +52,12 @@ struct alt_instr {
> > u8 padlen; /* length of build-time padding */
> > } __packed;
> >
> > +/*
> > + * Debug flag that can be tested to see whether alternative
> > + * instructions were patched in already:
> > + */
> > +extern int alternatives_patched;
>
> Looks useful...
>
> > +
> > extern void alternative_instructions(void);
> > extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
> >
> > diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
> > index 31a002ad5aeb..ab2c507b58b6 100644
> > --- a/arch/x86/include/asm/fpu/xstate.h
> > +++ b/arch/x86/include/asm/fpu/xstate.h
> > @@ -119,7 +119,7 @@ static inline int xsave_state(struct xsave_struct *fx)
> > u32 hmask = mask >> 32;
> > int err = 0;
> >
> > - WARN_ON(system_state == SYSTEM_BOOTING);
> > + WARN_ON(!alternatives_patched);
>
> Btw, shouldn't we be doing this check in what is called now
> copy_kernel_to_xregs() too, i.e., right under this function which is now
> called copy_xregs_to_kernel()?

Yeah, makes sense - and we can now use WARN_ON_FPU() as well to not
have the overhead when debugging is disabled.

Thanks,

Ingo

2015-05-06 03:35:19

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 206/208] x86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging code


* Borislav Petkov <[email protected]> wrote:

> On Tue, May 05, 2015 at 07:58:30PM +0200, Ingo Molnar wrote:
> > There are various internal FPU state debugging checks that never
> > trigger in practice, but which are useful for FPU code development.
> >
> > Separate these out into CONFIG_X86_DEBUG_FPU=y, and also add a
> > couple of new ones.
> >
> > The size difference is about 0.5K of code on defconfig:
> >
> > text data bss filename
> > 15028906 2578816 1638400 vmlinux
> > 15029430 2578816 1638400 vmlinux
> >
> > ( Keep this enabled by default until the new FPU code is debugged. )
> >
> > Cc: Andy Lutomirski <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > Cc: Dave Hansen <[email protected]>
> > Cc: Fenghua Yu <[email protected]>
> > Cc: H. Peter Anvin <[email protected]>
> > Cc: Linus Torvalds <[email protected]>
> > Cc: Oleg Nesterov <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Signed-off-by: Ingo Molnar <[email protected]>
>
> ...
>
> > diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
> > index a4c1b7dbf70e..d2a281bd5f45 100644
> > --- a/arch/x86/include/asm/fpu/internal.h
> > +++ b/arch/x86/include/asm/fpu/internal.h
> > @@ -59,6 +59,15 @@ extern void fpu__clear(struct fpu *fpu);
> > extern void fpu__init_check_bugs(void);
> > extern void fpu__resume_cpu(void);
> >
> > +/*
> > + * Debugging facility:
> > + */
> > +#ifdef CONFIG_X86_DEBUG_FPU
> > +# define WARN_ON_FPU(x) WARN_ON_ONCE(x)
> > +#else
> > +# define WARN_ON_FPU(x) ({ 0; })
>
> Shouldn't this be called FPU_WARN_ON() ?

So I wanted this to match the 'usual' WARN*() APIs in appearance, with
only a small signal at the end that it is conditional on FPU debugging
being enabled. In terms of code, we should think of them as
WARN_ON()s. Slapping FPU_ in front of them distracts from that IMHO.

No strong feelings though.

Thanks,

Ingo

2015-05-06 04:11:14

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem


* Dave Hansen <[email protected]> wrote:

> On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> > x86/fpu: Cost of: XSAVE insn : 104 cycles
> > x86/fpu: Cost of: XRSTOR insn : 80 cycles
>
> Isn't there going to be pretty huge variability here depending on
> how much state you are xsave/xrstor'ing and if the init/modified
> optimizations are in play?

Hopefully there's such variability! :)

I'm planning to add measurements for that as well:

- to see the costs of this instruction family when various xstate
components are in 'init state' or not

- maybe even measure whether it can optimize based on whether things
got changed since the last save (which the SDM kind of alludes to
but which I doubt the hw does)?

This initial version only measures trivial init state save/restore
cost.

Thanks,

Ingo

2015-05-06 04:13:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 200/208] x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end


* Yu, Fenghua <[email protected]> wrote:

> > From: Ingo Molnar [mailto:[email protected]] On Behalf Of Ingo
> > Molnar
> > Sent: Tuesday, May 05, 2015 10:58 AM
> > To: [email protected]
> > Cc: Andy Lutomirski; Borislav Petkov; Dave Hansen; Yu, Fenghua; H. Peter
> > Anvin; Linus Torvalds; Oleg Nesterov; Thomas Gleixner
> > Subject: [PATCH 200/208] x86/fpu/xstate: Don't assume the first zero
> > xfeatures zero bit means the end
> >
> > The current xstate code in setup_xstate_features() assumes that the first
> > zero bit means the end of xfeatures - but that is not so, the SDM clearly
> > states that an arbitrary set of xfeatures might be enabled - and it is also clear
> > from the description of the compaction feature that holes are possible:
>
> A previous patch in lkml has (exactly) the same fix:
> http://lists-archives.com/linux-kernel/28292115-x86-xsave-c-fix-xstate-offsets-and-sizes-enumeration.html

Ok, and that series has some more bits as well - will merge.

Thanks,

Ingo

2015-05-06 04:20:36

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 198/208] x86/fpu: Document the various fpregs state formats


* Dave Hansen <[email protected]> wrote:

> On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> > +/*
> > + * This is our most modern FPU state format, as saved by the XSAVE
> > + * and restored by the XRSTOR instructions.
> > + *
> > + * It consists of a legacy fxregs portion, an xstate header and
> > + * subsequent fixed size areas as defined by the xstate header.
> > + * Not all CPUs support all the extensions.
> > + */
> > struct xregs_state {
> > struct fxregs_state i387;
> > struct xstate_header header;
> > @@ -150,6 +169,13 @@ struct xregs_state {
> > /* New processor state extensions will go here. */
> > } __attribute__ ((packed, aligned (64)));
>
> Fenghua has a "fix" for this, but I think this misses a pretty big point.
>
> This structure includes only the "legacy" state, followed by the header.
> The remainder of the layout here is enumerated in CPUID leaves and can
> not be laid out in a structure because we do not know what it looks like
> until we run CPUID.
>
> There is logically a variable length array at the end of this
> sucker.

Yes, exactly, that is where we want to go, and this direction is what
I tried to cover with this bit of the series:

struct xregs_state {
struct fxregs_state i387;
struct xstate_header header;
u8 __reserved[XSTATE_RESERVE];
} __attribute__ ((packed, aligned (64)));

Note how it's now opaque after the xstate header, because there's no
guarantee of what's in that area.

The only 'fixed' aspect of the xstates is the feature bit enumeration:

enum xfeature_bit {
XSTATE_BIT_FP,
XSTATE_BIT_SSE,
XSTATE_BIT_YMM,
XSTATE_BIT_BNDREGS,
XSTATE_BIT_BNDCSR,
XSTATE_BIT_OPMASK,
XSTATE_BIT_ZMM_Hi256,
XSTATE_BIT_Hi16_ZMM,

XFEATURES_NR_MAX,

Plus with point #4 of the announcement I wanted to signal that I think
we should allocate the variable part dynamically:

4)

task->thread.fpu->state got embedded again, as
task->thread.fpu.state. This eliminated a lot of awkward late
dynamic memory allocation of FPU state and the problematic handling
of failures.

Note that while the allocation is static right now, this is a WIP
interim state: we can still do dynamic allocation of FPU state, by
moving the FPU state last in task_struct and then allocating
task_struct accordingly.

I.e. we can put the variable size state array at the end of
task_struct, make task_struct size per boot variable and still have
essentially a single static allocation for all fundamental task state.

But I first wanted to see people test this series - it's ambitious
enough as-is already!

Thanks,

Ingo

2015-05-06 04:52:46

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem


* Andy Lutomirski <[email protected]> wrote:

> On May 5, 2015 11:30 PM, "Ingo Molnar" <[email protected]> wrote:
> >
> > Add a short FPU performance suite that runs once during bootup.
> >
> > It can be enabled via CONFIG_X86_DEBUG_FPU_PERFORMANCE=y.
>
> Neat!
>
> Can you change "cycles" to "TSC ticks"? They're not quite the same thing.

Yeah, with constant TSC we have the magic TSC frequency that is used
by RDTSC.

I'm torn: 'TSC ticks' will mean very little to most people reading
that output. We could convert it to nsecs with a little bit of
calibration - but that makes it depend on small differences in CPU
model frequencies, while the (cached) cycle costs are typically
constant per microarchitecture.

I suspect we could snatch a performance counter temporarily, to get
the real cycles count, and maybe even add a uops column. Most of this
needs to run in kernel space, so it's not a tooling project.

I also wanted to add cache-cold numbers which are very interesting as
well, just awfully hard to measure in a stable fashion. For cache-cold
numbers the natural unit would be memory bus cycles.

Thanks,

Ingo

2015-05-06 04:54:48

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 201/208] x86/fpu: Clean up xstate feature reservation


* Dave Hansen <[email protected]> wrote:

> On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> > struct xregs_state {
> > struct fxregs_state i387;
> > struct xstate_header header;
> > - struct ymmh_struct ymmh;
> > - struct lwp_struct lwp;
> > - struct bndreg bndreg[4];
> > - struct bndcsr bndcsr;
> > - /* New processor state extensions will go here. */
> > + u8 __reserved[XSTATE_RESERVE];
> > } __attribute__ ((packed, aligned (64)));
>
> Just to reiterate. The size of 'XSTATE_RESERVE' is completely
> unknown at compile time. [...]

Yes, see my previous mail.

> [...] It's wrong in the existing kernel, but we should fix it up
> instead of mucking with it like this.

I'm not mucking with it, I'm slowly migrating it towards a dynamic
model that doesn't suck - see my other mails.

Thanks,

Ingo

2015-05-06 15:53:47

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem

On Wed, May 06, 2015 at 06:52:39AM +0200, Ingo Molnar wrote:
> > Can you change "cycles" to "TSC ticks"? They're not quite the same thing.
>
> Yeah, with constant TSC we have the magic TSC frequency that is used
> by RDTSC.
>
> I'm torn: 'TSC ticks' will mean very little to most people reading
> that output. We could convert it to nsecs with a little bit of
> calibration - but that makes it depend on small differences in CPU
> model frequencies, while the (cached) cycle costs are typically
> constant per microarchitecture.

I think the best we can do is convert the TSC ticks to the unboosted
P0 frequency, i.e. if the P0 freq is say, 4GHz, we have 4*10^9 core
cycles per second. And then convert the counted TSC ticks to those
cycles.

For that we would need to measure in the beginning how TSC ticks relate
to P0 cycles and then use that number for conversion...

Hmmm.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-05-07 02:52:43

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem

On May 6, 2015 10:22 AM, "Ingo Molnar" <[email protected]> wrote:
>
>
> * Andy Lutomirski <[email protected]> wrote:
>
> > On May 5, 2015 11:30 PM, "Ingo Molnar" <[email protected]> wrote:
> > >
> > > Add a short FPU performance suite that runs once during bootup.
> > >
> > > It can be enabled via CONFIG_X86_DEBUG_FPU_PERFORMANCE=y.
> >
> > Neat!
> >
> > Can you change "cycles" to "TSC ticks"? They're not quite the same thing.
>
> Yeah, with constant TSC we have the magic TSC frequency that is used
> by RDTSC.
>
> I'm torn: 'TSC ticks' will mean very little to most people reading
> that output. We could convert it to nsecs with a little bit of
> calibration - but that makes it depend on small differences in CPU
> model frequencies, while the (cached) cycle costs are typically
> constant per microarchitecture.

Isn't it dependent on the ratio of max turbo frequency to TSC freq?
Typical non-ultra-mobile systems should be at or near max turbo
frequency during bootup.

>
> I suspect we could snatch a performance counter temporarily, to get
> the real cycles count, and maybe even add a uops column. Most of this
> needs to run in kernel space, so it's not a tooling project.

This will suck under KVM without extra care. I know, because I'm
working on a similar userspace tool that uses RDPMC.

Another option would be rdmsr(MSR_IA32_APERF), but that isn't
available under KVM either.

>
> I also wanted to add cache-cold numbers which are very interesting as
> well, just awfully hard to measure in a stable fashion. For cache-cold
> numbers the natural unit would be memory bus cycles.

Yeah, maybe it's worth wiring up perf counters at some point.

--Andy

2015-05-12 17:46:50

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 000/208] big x86 FPU code rewrite

Hey Ingo,

This throws a warning if I try to run one of my MPX programs:

> [ 22.907739] ------------[ cut here ]------------
> [ 22.907776] WARNING: CPU: 0 PID: 500 at /home/davehans/linux.git/arch/x86/kernel/fpu/core.c:324 fpu__activate_stopped+0x87/0x90()
> [ 22.907836] Modules linked in:
> [ 22.907859] CPU: 0 PID: 500 Comm: mpx-mini-test-v Not tainted 4.1.0-rc2-00208-ga9a0b36 #1181
> [ 22.907901] Hardware name: Intel Corporation
> [ 22.907958] ffffffff81c5e4d0 ffff880166bff988 ffffffff817fb4de 0000000000000000
> [ 22.908005] 0000000000000000 ffff880166bff9c8 ffffffff810a7dca ffff880166bff9d8
> [ 22.908049] ffff8800728d40c0 0000000000000200 0000000000000000 ffff88007b3fec00
> [ 22.908093] Call Trace:
> [ 22.908109] [<ffffffff817fb4de>] dump_stack+0x4c/0x65
> [ 22.908139] [<ffffffff810a7dca>] warn_slowpath_common+0x8a/0xc0
> [ 22.908169] [<ffffffff810a7eba>] warn_slowpath_null+0x1a/0x20
> [ 22.908201] [<ffffffff81061eb7>] fpu__activate_stopped+0x87/0x90
> [ 22.908230] [<ffffffff810623c5>] xfpregs_get+0x35/0xa0
> [ 22.908257] [<ffffffff8123535c>] elf_core_dump+0x56c/0x1600
> [ 22.908289] [<ffffffff817f95ed>] ? __slab_free+0x10a/0x212
> [ 22.908318] [<ffffffff8123e3eb>] do_coredump+0x78b/0xe90
> [ 22.908346] [<ffffffff810b2747>] ? __sigqueue_free.part.16+0x37/0x40
> [ 22.908379] [<ffffffff810b5c33>] get_signal+0x1d3/0x640
> [ 22.908407] [<ffffffff810b409e>] ? send_signal+0x3e/0x80
> [ 22.908434] [<ffffffff81056478>] do_signal+0x28/0x6c0
> [ 22.908464] [<ffffffff810d6ca9>] ? pick_next_entity+0xa9/0x190
> [ 22.908488] [<ffffffff810b4c45>] ? do_send_specific+0x85/0xa0
> [ 22.908521] [<ffffffff81056b88>] do_notify_resume+0x78/0x80
> [ 22.908549] [<ffffffff81803282>] int_signal+0x12/0x17
> [ 22.908575] ---[ end trace 80f689a9062056c5 ]---

This is booted with or without xsaves support.

BTW, I think you fixed the broken eagerfpu=no I was seeing on my hardware.

2015-05-12 21:58:55

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 193/208] x86/fpu: Rename all the fpregs, xregs, fxregs and fregs handling functions

On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> --- a/arch/x86/mm/mpx.c
> +++ b/arch/x86/mm/mpx.c
> @@ -389,7 +389,7 @@ int mpx_enable_management(struct task_struct *tsk)
> * directory into XSAVE/XRSTOR Save Area and enable MPX through
> * XRSTOR instruction.
> *
> - * xsave_state() is expected to be very expensive. Storing the bounds
> + * copy_xregs_to_kernel() is expected to be very expensive. Storing the bounds
> * directory here means that we do not have to do xsave in the unmap
> * path; we can just use mm->bd_addr instead.
> */

Nit: looks like there was a blind substitution done here. The comment
needs to get rewrapped.

2015-05-19 21:41:54

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 000/208] big x86 FPU code rewrite


* Ingo Molnar <[email protected]> wrote:

> [Second part of the series - Gmail didn't like me sending so many mails.]

Just a quick update: I merged these patches to -rc4 (sans the
benchmarking bits, which still need more work) and applied them to
tip:x86/fpu and have pushed them out, so that it gets more testing,
etc.

It's also tip:master, and if it stays problem-free I'll push it out
towards linux-next as well.

Please have a look.

Thanks,

Ingo

2015-05-27 01:22:48

by Bobby Powers

[permalink] [raw]
Subject: Re: [PATCH 000/208] big x86 FPU code rewrite

Hello,

Ingo Molnar <[email protected]> wrote:
> Please have a look.

I've been running this for ~ 2 weeks. I've only seen one issue, when
emerging mesa 10.5.6:

[May26 20:41] traps: aclocal-1.15[27452] trap invalid opcode
ip:7f6331031ab0 sp:7ffe73ece880 error:0 in
libperl.so.5.20.2[7f6330f18000+19e000]
[ +0.000051] ------------[ cut here ]------------
[ +0.000005] WARNING: CPU: 0 PID: 27452 at
arch/x86/kernel/fpu/core.c:324 fpu__activate_stopped+0x8a/0xa0()
[ +0.000002] Modules linked in: bnep iwlmvm btusb btintel bluetooth iwlwifi
[ +0.000007] CPU: 0 PID: 27452 Comm: aclocal-1.15 Not tainted 4.1.0-rc5+ #163
[ +0.000001] Hardware name: LENOVO 20BSCTO1WW/20BSCTO1WW, BIOS
N14ET24W (1.02 ) 10/27/2014
[ +0.000001] ffffffff82172735 ffff88017cccb998 ffffffff81c4f534
0000000080000000
[ +0.000002] 0000000000000000 ffff88017cccb9d8 ffffffff8112611a
ffff88017cccb9f8
[ +0.000002] ffff88018e352400 0000000000000000 0000000000000000
ffff8801ef813a00
[ +0.000002] Call Trace:
[ +0.000004] [<ffffffff81c4f534>] dump_stack+0x4f/0x7b
[ +0.000003] [<ffffffff8112611a>] warn_slowpath_common+0x8a/0xc0
[ +0.000003] [<ffffffff8112620a>] warn_slowpath_null+0x1a/0x20
[ +0.000002] [<ffffffff81059c9a>] fpu__activate_stopped+0x8a/0xa0
[ +0.000002] [<ffffffff8105a221>] xfpregs_get+0x31/0x90
[ +0.000001] [<ffffffff8105bcc9>] ? getreg+0xa9/0x130
[ +0.000003] [<ffffffff812ba121>] elf_core_dump+0x531/0x1490
[ +0.000003] [<ffffffff812c3671>] do_coredump+0xbd1/0xef0
[ +0.000004] [<ffffffff81150238>] ? try_to_wake_up+0x1f8/0x350
[ +0.000002] [<ffffffff81134a4c>] get_signal+0x38c/0x700
[ +0.000003] [<ffffffff8104dbb8>] do_signal+0x28/0x760
[ +0.000002] [<ffffffff8104e92d>] ? do_trap+0x6d/0x150
[ +0.000002] [<ffffffff8126246e>] ? vfs_read+0x11e/0x140
[ +0.000003] [<ffffffff8152f481>] ? trace_hardirqs_off_thunk+0x17/0x19
[ +0.000002] [<ffffffff8104e360>] do_notify_resume+0x70/0x80
[ +0.000002] [<ffffffff81c58d82>] retint_signal+0x42/0x80
[ +0.000002] ---[ end trace 8baea2e2110d6ca1 ]---

This trace is a bit off - the path to fpu__activate_stopped from
elf_core_dump looks like:

fpu__activate_stopped
xfpregs_get
fill_thread_core_info
fill_note_info
elf_core_dump

It looks like the WARN_ON_FPU there is just invalid? If we're
dumping, we have a valid case for curr == target.

I can reproduce this and I have the coredump, but I have no hope in
creating a test case out of this.

yours,
Bobby