Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754903AbbHNLU1 (ORCPT ); Fri, 14 Aug 2015 07:20:27 -0400 Received: from mx1.redhat.com ([209.132.183.28]:56648 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753903AbbHNLU0 (ORCPT ); Fri, 14 Aug 2015 07:20:26 -0400 Message-ID: <55CDCEF4.1090906@redhat.com> Date: Fri, 14 Aug 2015 13:20:20 +0200 From: Denys Vlasenko User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: Linus Torvalds CC: Kees Cook , David Drysdale , Andy Lutomirski , "linux-kernel@vger.kernel.org" , Will Drewry , Ingo Molnar , Alok Kataria , Borislav Petkov , Alexei Starovoitov , Frederic Weisbecker , "H. Peter Anvin" , Oleg Nesterov , Steven Rostedt , X86 ML Subject: Re: [Regression v4.2 ?] 32-bit seccomp-BPF returned errno values wrong in VM? References: <55CCB510.3060807@redhat.com> <55CD0DAC.9080809@redhat.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2764 Lines: 72 On 08/14/2015 12:27 AM, Linus Torvalds wrote: > On Thu, Aug 13, 2015 at 2:35 PM, Denys Vlasenko wrote: >> >> I suspect this change: >> >> .macro auditsys_entry_common >> ... >> movl %ebx,%esi /* 2nd arg: 1st syscall arg */ >> movl %eax,%edi /* 1st arg: syscall number */ >> call __audit_syscall_entry >> - movl RAX(%rsp),%eax /* reload syscall number */ >> - cmpq $(IA32_NR_syscalls-1),%rax >> - ja ia32_badsys >> + movl ORIG_RAX(%rsp),%eax /* reload syscall number */ >> >> We were reloading syscall# from pt_regs->ax. >> >> After the patch, pt_regs->ax isn't equal to syscall# on entry, >> instead it contains -ENOSYS. Therefore the change shown above >> was made, to reload it from pt_regs->orig_ax. >> >> Well. This still should work... in fact it is "more correct" >> than it was before... > > Well, since it doesn't work, that's clearly not the case. > > Also, you do realize that ORIG_RAX can get changed by signal handling > and ptrace? I am very aware of that, yes. If it changes, we should use *it*. That's why new code in this part is "more correct" than old one. > In fact, I think that whole "save -ENOSYS to pt_regs->ax" is BS. > Exactly because we use pt_regs->ax for ptrace etc, and you've changed > the register state we expose. ptrace always sees pt_regs->ax = -ENOSYS on syscall entry. That's part of the ABI. Syscall# is in pt_regs->orig_ax. We used to do that _only_ on ptrace code path, the fast path didn't store -ENOSYS in pt_regs->ax. This optimization ended up being more pain than gain, and it was changed for 64-bit code by this commit: commit 54eea9957f5763dd1a2555d7e4cb53b4dd389cc6 Author: Andy Lutomirski Date: Fri Sep 5 15:13:55 2014 -0700 x86_64, entry: Treat regs->ax the same in fastpath and slowpath syscalls I changed 32-bit compat code to do the same thing. > I'm also going to be a *lot* less inclined to take all these idiotic > low-level x86 changes from now on. It's been a total pain, for very > little gain. These "cleanups" have been buggy as hell, and test > coverage for the compat case is clearly lacking. The code in question was an unholy mess, with about a half dozen "clever optimizations" tangled together. Now it is much, much less ugly. It was nearly inevitable that something would break during untangling. I understand the frustration of having things breaking "because of the stupid cleanup". -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/