Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751622AbdD0Ddu (ORCPT ); Wed, 26 Apr 2017 23:33:50 -0400 Received: from mga07.intel.com ([134.134.136.100]:7773 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751008AbdD0Dds (ORCPT ); Wed, 26 Apr 2017 23:33:48 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.37,257,1488873600"; d="scan'208";a="1161505070" Message-ID: <1493264026.36058.94.camel@ranerica-desktop> Subject: Re: [v6 PATCH 12/21] x86/insn: Support both signed 32-bit and 64-bit effective addresses From: Ricardo Neri To: Borislav Petkov Cc: Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Andrew Morton , Brian Gerst , Chris Metcalf , Dave Hansen , Paolo Bonzini , Masami Hiramatsu , Huang Rui , Jiri Slaby , Jonathan Corbet , "Michael S. Tsirkin" , Paul Gortmaker , Vlastimil Babka , Chen Yucong , Alexandre Julliard , Stas Sergeev , Fenghua Yu , "Ravi V. Shankar" , Shuah Khan , linux-kernel@vger.kernel.org, x86@kernel.org, linux-msdos@vger.kernel.org, wine-devel@winehq.org, Adam Buchbinder , Colin Ian King , Lorenzo Stoakes , Qiaowei Ren , Arnaldo Carvalho de Melo , Adrian Hunter , Kees Cook , Thomas Garnier , Dmitry Vyukov Date: Wed, 26 Apr 2017 20:33:46 -0700 In-Reply-To: <20170425135150.5sk4kwaw2qrsxre3@pd.tnic> References: <20170308003254.27833-1-ricardo.neri-calderon@linux.intel.com> <20170308003254.27833-13-ricardo.neri-calderon@linux.intel.com> <20170425135150.5sk4kwaw2qrsxre3@pd.tnic> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4-0ubuntu2 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6487 Lines: 183 On Tue, 2017-04-25 at 15:51 +0200, Borislav Petkov wrote: > On Tue, Mar 07, 2017 at 04:32:45PM -0800, Ricardo Neri wrote: > > The 32-bit and 64-bit address encodings are identical. This means that we > > can use the same function in both cases. In order to reuse the function for > > 32-bit address encodings, we must sign-extend our 32-bit signed operands to > > 64-bit signed variables (only for 64-bit builds). To decide on whether sign > > extension is needed, we rely on the address size as given by the > > instruction structure. > > > > Lastly, before computing the linear address, we must truncate our signed > > 64-bit signed effective address if the address size is 32-bit. > > > > Cc: Dave Hansen > > Cc: Adam Buchbinder > > Cc: Colin Ian King > > Cc: Lorenzo Stoakes > > Cc: Qiaowei Ren > > Cc: Arnaldo Carvalho de Melo > > Cc: Masami Hiramatsu > > Cc: Adrian Hunter > > Cc: Kees Cook > > Cc: Thomas Garnier > > Cc: Peter Zijlstra > > Cc: Borislav Petkov > > Cc: Dmitry Vyukov > > Cc: Ravi V. Shankar > > Cc: x86@kernel.org > > Signed-off-by: Ricardo Neri > > --- > > arch/x86/lib/insn-eval.c | 44 ++++++++++++++++++++++++++++++++------------ > > 1 file changed, 32 insertions(+), 12 deletions(-) > > > > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c > > index edb360f..a9a1704 100644 > > --- a/arch/x86/lib/insn-eval.c > > +++ b/arch/x86/lib/insn-eval.c > > @@ -559,6 +559,15 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs) > > return get_reg_offset(insn, regs, REG_TYPE_INDEX); > > } > > > > +static inline long __to_signed_long(unsigned long val, int long_bytes) > > +{ > > +#ifdef CONFIG_X86_64 > > + return long_bytes == 4 ? (long)((int)((val) & 0xffffffff)) : (long)val; > > I don't think this always works as expected: > > --- > typedef unsigned int u32; > typedef unsigned long u64; > > int main() > { > u64 v = 0x1ffffffff; > > printf("v: %ld, 0x%lx, %ld\n", v, v, (long)((int)((v) & 0xffffffff))); > > return 0; > } > -- > ... > > v: 8589934591, 0x1ffffffff, -1 > > Now, this should not happen on 32-bit because unsigned long is 32-bit > there but can that happen on 64-bit? This is the reason I check the value of long_bytes. If long_bytes is not 4, being the only other possible value 8 (perhaps I need to issue an error when the value is not any of these values), the cast is simply (long)val. I modified your test program with: printf("v: %ld, 0x%lx, %ld, %ld\n", v, v, (long)((int)((v) & 0xffffffff)), (long)v); and I get: v: 8589934591, 0x1ffffffff, -1, 8589934591. Am I missing something? > > > +#else > > + return (long)val; > > +#endif > > +} > > + > > /* > > * return the address being referenced be instruction > > * for rm=3 returning the content of the rm reg > > @@ -567,19 +576,21 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs) > > void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs) > > { > > unsigned long linear_addr, seg_base_addr; > > - long eff_addr, base, indx; > > - int addr_offset, base_offset, indx_offset; > > + long eff_addr, base, indx, tmp; > > + int addr_offset, base_offset, indx_offset, addr_bytes; > > insn_byte_t sib; > > > > insn_get_modrm(insn); > > insn_get_sib(insn); > > sib = insn->sib.value; > > + addr_bytes = insn->addr_bytes; > > > > if (X86_MODRM_MOD(insn->modrm.value) == 3) { > > addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM); > > if (addr_offset < 0) > > goto out_err; > > - eff_addr = regs_get_register(regs, addr_offset); > > + tmp = regs_get_register(regs, addr_offset); > > + eff_addr = __to_signed_long(tmp, addr_bytes); > > This repeats throughout the function so it begs to be a separate: > > get_mem_addr() > > or so. Yes, the same pattern is used in all places except when using register operands (ModRM.rm == 11b). I will look into putting it in a function. > > > seg_base_addr = insn_get_seg_base(regs, insn, addr_offset, > > false); > > } else { > > @@ -591,20 +602,24 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs) > > * in the address computation. > > */ > > base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE); > > - if (unlikely(base_offset == -EDOM)) > > + if (unlikely(base_offset == -EDOM)) { > > base = 0; > > - else if (unlikely(base_offset < 0)) > > + } else if (unlikely(base_offset < 0)) { > > goto out_err; > > - else > > - base = regs_get_register(regs, base_offset); > > + } else { > > + tmp = regs_get_register(regs, base_offset); > > + base = __to_signed_long(tmp, addr_bytes); > > + } > > > > indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX); > > - if (unlikely(indx_offset == -EDOM)) > > + if (unlikely(indx_offset == -EDOM)) { > > indx = 0; > > - else if (unlikely(indx_offset < 0)) > > + } else if (unlikely(indx_offset < 0)) { > > goto out_err; > > - else > > - indx = regs_get_register(regs, indx_offset); > > + } else { > > + tmp = regs_get_register(regs, indx_offset); > > + indx = __to_signed_long(tmp, addr_bytes); > > + } > > > > eff_addr = base + indx * (1 << X86_SIB_SCALE(sib)); > > seg_base_addr = insn_get_seg_base(regs, insn, > > @@ -625,13 +640,18 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs) > > } else if (addr_offset < 0) { > > goto out_err; > > } else { > > - eff_addr = regs_get_register(regs, addr_offset); > > + tmp = regs_get_register(regs, addr_offset); > > + eff_addr = __to_signed_long(tmp, addr_bytes); > > } > > seg_base_addr = insn_get_seg_base(regs, insn, > > addr_offset, false); > > } > > eff_addr += insn->displacement.value; > > } > > + /* truncate to 4 bytes for 32-bit effective addresses */ > > + if (addr_bytes == 4) > > + eff_addr &= 0xffffffff; > > Why again? eff_addr is a long variable, which in x86_64 has 64-bit. However, in 32-bit segments the effective address is 32-bit. Thus, I discard the 32 most significant bytes. Thanks and BR, Ricardo