Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932570AbdDRJmq (ORCPT ); Tue, 18 Apr 2017 05:42:46 -0400 Received: from mx2.suse.de ([195.135.220.15]:53711 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932434AbdDRJml (ORCPT ); Tue, 18 Apr 2017 05:42:41 -0400 Date: Tue, 18 Apr 2017 11:42:21 +0200 From: Borislav Petkov To: Ricardo Neri Cc: Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Andrew Morton , Brian Gerst , Chris Metcalf , Dave Hansen , Paolo Bonzini , Masami Hiramatsu , Huang Rui , Jiri Slaby , Jonathan Corbet , "Michael S. Tsirkin" , Paul Gortmaker , Vlastimil Babka , Chen Yucong , Alexandre Julliard , Stas Sergeev , Fenghua Yu , "Ravi V. Shankar" , Shuah Khan , linux-kernel@vger.kernel.org, x86@kernel.org, linux-msdos@vger.kernel.org, wine-devel@winehq.org, Adam Buchbinder , Colin Ian King , Lorenzo Stoakes , Qiaowei Ren , Arnaldo Carvalho de Melo , Adrian Hunter , Kees Cook , Thomas Garnier , Dmitry Vyukov Subject: Re: [v6 PATCH 06/21] x86/insn-eval: Add utility functions to get segment selector Message-ID: <20170418094221.zamus5butw6yrfky@pd.tnic> References: <20170308003254.27833-1-ricardo.neri-calderon@linux.intel.com> <20170308003254.27833-7-ricardo.neri-calderon@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20170308003254.27833-7-ricardo.neri-calderon@linux.intel.com> User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8453 Lines: 242 On Tue, Mar 07, 2017 at 04:32:39PM -0800, Ricardo Neri wrote: > When computing a linear address and segmentation is used, we need to know > the base address of the segment involved in the computation. In most of > the cases, the segment base address will be zero as in USER_DS/USER32_DS. > However, it may be possible that a user space program defines its own > segments via a local descriptor table. In such a case, the segment base > address may not be zero .Thus, the segment base address is needed to > calculate correctly the linear address. > > The segment selector to be used when computing a linear address is > determined by either any of segment select override prefixes in the > instruction or inferred from the registers involved in the computation of > the effective address; in that order. Also, there are cases when the > overrides shall be ignored. > > For clarity, this process can be split into two steps: resolving the > relevant segment and, once known, read the applicable segment selector. > The method to obtain the segment selector depends on several factors. In > 32-bit builds, segment selectors are saved into the pt_regs structure > when switching to kernel mode. The same is also true for virtual-8086 > mode. In 64-bit builds, segmentation is mostly ignored, except when > running a program in 32-bit legacy mode. In this case, CS and SS can be > obtained from pt_regs. DS, ES, FS and GS can be read directly from > registers. > Lastly, segmentation is possible in 64-bit mode via FS and GS. I'd say "Lastly, the only two segment registers which are not ignored in long mode are FS and GS." > In these two cases, base addresses are obtained from the relevant MSRs. s/relevant/respective/ > Cc: Dave Hansen > Cc: Adam Buchbinder > Cc: Colin Ian King > Cc: Lorenzo Stoakes > Cc: Qiaowei Ren > Cc: Arnaldo Carvalho de Melo > Cc: Masami Hiramatsu > Cc: Adrian Hunter > Cc: Kees Cook > Cc: Thomas Garnier > Cc: Peter Zijlstra > Cc: Borislav Petkov > Cc: Dmitry Vyukov > Cc: Ravi V. Shankar > Cc: x86@kernel.org > Signed-off-by: Ricardo Neri > --- > arch/x86/lib/insn-eval.c | 195 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 195 insertions(+) > > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c > index 78df1c9..8d45df8 100644 > --- a/arch/x86/lib/insn-eval.c > +++ b/arch/x86/lib/insn-eval.c > @@ -8,6 +8,7 @@ > #include > #include > #include > +#include > > enum reg_type { > REG_TYPE_RM = 0, > @@ -15,6 +16,200 @@ enum reg_type { > REG_TYPE_BASE, > }; > > +enum segment { > + SEG_CS = 0x23, > + SEG_SS = 0x36, > + SEG_DS = 0x3e, > + SEG_ES = 0x26, > + SEG_FS = 0x64, > + SEG_GS = 0x65 > +}; > + > +/** > + * resolve_seg_selector() - obtain segment selector > + * @regs: Set of registers containing the segment selector That arg is gone. > + * @insn: Instruction structure with selector override prefixes > + * @regoff: Operand offset, in pt_regs, of which the selector is needed > + * @default: Resolve default segment selector (i.e., ignore overrides) > + * > + * The segment selector to which an effective address refers depends on > + * a) segment selector overrides instruction prefixes or b) the operand > + * register indicated in the ModRM or SiB byte. > + * > + * For case a), the function inspects any prefixes in the insn instruction; s/insn // > + * insn can be null to indicate that selector override prefixes shall be > + * ignored. This is not what the code does: it returns -EINVAL when insn is NULL. > This is useful when the use of prefixes is forbidden (e.g., > + * obtaining the code selector). For case b), the operand register shall be > + * represented as the offset from the base address of pt_regs. Also, regoff > + * can be -EINVAL for cases in which registers are not used as operands (e.g., > + * when the mod and r/m parts of the ModRM byte are 0 and 5, respectively). > + * > + * This function returns the segment selector to utilize as per the conditions > + * described above. Please note that this functin does not return the value > + * of the segment selector. The value of the segment selector needs to be > + * obtained using get_segment_selector and passing the segment selector type > + * resolved by this function. > + * > + * Return: Segment selector to use, among CS, SS, DS, ES, FS or GS. : negative value when... > + */ > +static int resolve_seg_selector(struct insn *insn, int regoff, bool get_default) > +{ > + int i; > + > + if (!insn) > + return -EINVAL; > + > + if (get_default) > + goto default_seg; > + /* > + * Check first if we have selector overrides. Having more than > + * one selector override leads to undefined behavior. We > + * only use the first one and return Well, I'd return -EINVAL to catch that undefined behavior. Note in a local var that I've already seen a seg reg and then if I see another one, return -EINVAL. > + */ > + for (i = 0; i < insn->prefixes.nbytes; i++) { > + switch (insn->prefixes.bytes[i]) { > + case SEG_CS: > + return SEG_CS; > + case SEG_SS: > + return SEG_SS; > + case SEG_DS: > + return SEG_DS; > + case SEG_ES: > + return SEG_ES; > + case SEG_FS: > + return SEG_FS; > + case SEG_GS: > + return SEG_GS; So what happens if you're in 64-bit mode and you have CS, DS, ES, or SS? Or is this what @get_default is supposed to do? But it doesn't look like it, it still returns segments ignored in 64-bit mode. > + default: > + return -EINVAL; > + } > + } > + > +default_seg: > + /* > + * If no overrides, use default selectors as described in the > + * Intel documentation: SS for ESP or EBP. DS for all data references, > + * except when relative to stack or string destination. > + * Also, AX, CX and DX are not valid register operands in 16-bit > + * address encodings. > + * Callers must interpret the result correctly according to the type > + * of instructions (e.g., use ES for string instructions). > + * Also, some values of modrm and sib might seem to indicate the use > + * of EBP and ESP (e.g., modrm_mod = 0, modrm_rm = 5) but actually > + * they refer to cases in which only a displacement used. These cases > + * should be indentified by the caller and not with this function. > + */ > + switch (regoff) { > + case offsetof(struct pt_regs, ax): > + /* fall through */ > + case offsetof(struct pt_regs, cx): > + /* fall through */ > + case offsetof(struct pt_regs, dx): > + if (insn && insn->addr_bytes == 2) > + return -EINVAL; > + case -EDOM: /* no register involved in address computation */ > + case offsetof(struct pt_regs, bx): > + /* fall through */ > + case offsetof(struct pt_regs, di): > + /* fall through */ return SEG_ES; ? It is even in the comment above. I'm looking at MOVS %es:%rdi, %ds:%rsi, for example. > + case offsetof(struct pt_regs, si): > + return SEG_DS; > + case offsetof(struct pt_regs, bp): > + /* fall through */ > + case offsetof(struct pt_regs, sp): > + return SEG_SS; > + case offsetof(struct pt_regs, ip): > + return SEG_CS; > + default: > + return -EINVAL; > + } > +} > + > +/** > + * get_segment_selector() - obtain segment selector > + * @regs: Set of registers containing the segment selector > + * @seg_type: Type of segment selector to obtain > + * @regoff: Operand offset, in pt_regs, of which the selector is needed That's gone. > + * > + * Obtain the segment selector for any of CS, SS, DS, ES, FS, GS. In > + * CONFIG_X86_32, the segment is obtained from either pt_regs or > + * kernel_vm86_regs as applicable. In CONFIG_X86_64, CS and SS are obtained > + * from pt_regs. DS, ES, FS and GS are obtained by reading the ds and es, fs > + * and gs, respectively. ... and DS and ES are ignored in long mode. > + * > + * Return: Value of the segment selector ... or negative... > + */ > +static unsigned short get_segment_selector(struct pt_regs *regs, > + enum segment seg_type) > +{ -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --