From: Linus Torvalds
Date: Mon, 21 Nov 2016 13:21:35 -0800
Subject: Re: What exactly do 32-bit x86 exceptions push on the stack in the CS slot?
To: "H. Peter Anvin"
Cc: Ingo Molnar, Andy Lutomirski, Brian Gerst, Andy Lutomirski, tedheadster@gmail.com, George Spelvin, "linux-kernel@vger.kernel.org", X86 ML
In-Reply-To: <5bc7c7b2-875e-6366-9244-7dc6e2fae5c1@zytor.com>
References: <20161121071342.GA16999@gmail.com> <5bc7c7b2-875e-6366-9244-7dc6e2fae5c1@zytor.com>

On Mon, Nov 21, 2016 at 10:26 AM, H. Peter Anvin wrote:
> On 11/21/16 10:00, Linus Torvalds wrote:
>>
>> I'd much rather we go back to just making the "cs" entry explicitly
>> 16-bit, and have a separate padding entry, the way we used to long
>> long ago.
>>
>
> I would agree 100% with this.

We _used_ to do it like this in some places (signal stack, other places):

    unsigned short cs, __csh;

and

    int xcs;

in others (pt_regs, for example). You still see that "xcs" thing in the
x86 uapi/asm/ptrace.h file, but that's what our native pt_regs used to
look like.

And we still have that "cs+__cs" thing in at least 'struct
user_regs_struct32'. But our "struct pt_regs" has lost it.

I wonder why we broke that. I suspect it happened when we merged the
64-bit and 32-bit files, but I was too lazy to try to pinpoint it.

And I do think the original i386 model was better - exactly because it
didn't access undefined state when you just accessed "cs". Either you
had to know about it and it wasn't called 'cs' ("xcs") or you had that
high/low separation.

Of course, what might be better yet is to use an anonymous union, so
that you can do both of the above for all the cases (ie access it both
as a trustworthy low 16 bits, _and_ as a single 32-bit piece of
information).

We use anonymous unions all over now; we used to not do it because of
compiler limitations.

With an anonymous union, we could do something like

    union {
        unsigned int xcs;
        unsigned short cs;
    }

and so easily access either the reliable part (cs) or the full word
(xcs) without masking or having to play games.

[ In fact, I think we could try to make the "cs" member in that union
  be marked "const", which should mean that we'd get warnings if
  somebody were to try to assign just the half-word (so you'd always
  have to *assign* to "xcs", but you'd be able to read "cs"). I think
  that has made it from C++ to C. I'm not sure that's something we
  can/should use, but it sounds potentially useful for these kinds of
  cases ]

               Linus