Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp89062pxk; Tue, 1 Sep 2020 17:10:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzT52NXQF9tJvPKAex4TAKjY7olvAPUhYw0n9dMc36e642AOHpGyW6ReS/KuVo6QJFHiypd X-Received: by 2002:a50:fb15:: with SMTP id d21mr4301537edq.150.1599005457825; Tue, 01 Sep 2020 17:10:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599005457; cv=none; d=google.com; s=arc-20160816; b=Czk7IWAWPKORtY2GjvwgftpaBwVyE+5L3bcJuLD24J4VBsAqWwEPHK/vMYePfIGif9 lO3wSH0CL+1y0ZNLsgF/v2iUy7ZgQjvJGU6AQRw8GM32od6nG+AjKzwjMY1DOwOnS14G +JvSuAakmr5b8G8daBVkpq61q3fjJrJcaAcJA5HlfX8bpXapm8vjqv0JzaalnvY5Dgg6 NmOWacbVeFdCRlQiN9z3aAX2trZ0Kw6TBF0JgTGemYAgCyzohHsDbaVy/cBdjzuv5/eO Z978ySaCQn79Z+6CMXFuMn5nh8vGXYcgEywL+xymnabaQziNFMfO47VHn0XgWDZHtSDQ Hs/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=Mc06GkZ8XkoosP2bgyWXNn63xGyowg4OJQgQ62SA25A=; b=h/1h88zMcuqz6cOBCzYZ1lUkQ605TytpDgmH6PUsz8VVs8YqxAB/ED7b472AiOJGlO aTr9JSu7SOoTNBWL/NCq3MtGiTbH5RdxwVlJVM1HEATCJLOOGUmmGrrHyMPB2bm+63GI Ae4dZcoXXRC6PRuNRuCOkl49GBBQWTzan5rkkkE2RU9pUW2KeY7L1G8GApYR1CCcF0f3 bNr43Lg+otYeaYT0sCNgDYXj2bnjU64fwoh2dntW/+VFZwjqnvmz+fm9vQeFqinIUVZh Nyt55FRn79G//FGwWb4aGxBiZuGxJzbdFuzISueGao/EfAHpurBRDWd6Rqw78B9DTJrn rG9w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=TMu2S3Iz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id lo4si1572370ejb.190.2020.09.01.17.10.33; Tue, 01 Sep 2020 17:10:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=TMu2S3Iz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726503AbgIBAJx (ORCPT + 99 others); Tue, 1 Sep 2020 20:09:53 -0400 Received: from mail.kernel.org ([198.145.29.99]:37344 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726167AbgIBAJu (ORCPT ); Tue, 1 Sep 2020 20:09:50 -0400 Received: from mail-wm1-f42.google.com (mail-wm1-f42.google.com [209.85.128.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 5AD2020C56 for ; Wed, 2 Sep 2020 00:09:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599005388; bh=ue8KFmWAURt4lhvOfblC/6hWxuPhGkaF5B0T2fLxa7I=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=TMu2S3IzHfz+o6ohiqDLBmmcmE/E3rINy8vvAXGieza4ImIhAEf776UnghnDuHvzd 5RUSUwl4gq1gDaK4mi1DBXXH+ZDeKELWu02LXQKCNxeqTYN929b+1mZTWtyLPaPna1 +yVVQs19uJk+DPC0V7/4RRNdHi/w9iCHs5rjuS/c= Received: by mail-wm1-f42.google.com with SMTP id v4so2767089wmj.5 for ; Tue, 01 Sep 2020 17:09:48 -0700 (PDT) X-Gm-Message-State: AOAM531Y70Z0hh4+kK4oisbmPW8Qd713xH/Gp3RACpGDUL0lLifqSRJe pR1i6+9BKzOD9v3PIZtVpeikTzDoSM7lDR6HJ+OqRA== X-Received: by 2002:a1c:7e02:: with SMTP id z2mr4078821wmc.138.1599005386776; Tue, 01 Sep 2020 17:09:46 -0700 (PDT) MIME-Version: 1.0 References: <87k0xdjbtt.fsf@nanos.tec.linutronix.de> In-Reply-To: <87k0xdjbtt.fsf@nanos.tec.linutronix.de> From: Andy Lutomirski Date: Tue, 1 Sep 2020 17:09:35 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: ptrace_syscall_32 is failing To: Thomas Gleixner Cc: Andy Lutomirski , Brian Gerst , X86 ML , LKML , Will Deacon , Christian Borntraeger , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Catalin Marinas , linux-arm-kernel , Heiko Carstens , Vasily Gorbik , linux-s390 , linuxppc-dev Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Sep 1, 2020 at 4:50 PM Thomas Gleixner wrote: > > On Sun, Aug 30 2020 at 08:52, Andy Lutomirski wrote: > >> > [RUN] SYSCALL > >> > [FAIL] Initial args are wrong (nr=29, args=0 0 0 0 0 4289172732) > >> > [RUN] SYSCALL > >> > [OK] Args after SIGUSR1 are correct (ax = -514) > >> > [OK] Child got SIGUSR1 > >> > [RUN] Step again > >> > [OK] pause(2) restarted correctly > >> > >> Bisected to commit 0b085e68f407 ("x86/entry: Consolidate 32/64 bit > >> syscall entry"). > >> It looks like it is because syscall_enter_from_user_mode() is called > >> before reading the 6th argument from the user stack. > > Bah.I don't know how I managed to miss that part and interestingly > enough that none of the robots caught that either > > > Thomas, can we revert the syscall_enter() and syscall_exit() part of > > the series? > > Hrm. > > > I think that they almost work for x86, but not quite as > > indicated by this bug. Even if we imagine we can somehow hack around > > this bug, I imagine we're going to find other problems with this > > model, e.g. the potential upcoming exit problem I noted in my review. > > What's the upcoming problem? If we ever want to get single-stepping fully correct across syscalls, we might need to inject SIGTRAP on syscall return. This would be more awkward if we can't run instrumentable code after the syscall part of the syscall is done. > > > I really think the model should be: > > > > void do_syscall_whatever(...) > > { > > irqentry_enter(...); > > instrumentation_begin(); > > > > /* Do whatever arch ABI oddities are needed on entry. */ > > > > Then either: > > syscall_begin(arch, nr, regs); > > dispatch the syscall; > > syscall_end(arch, nr, regs); > > > > Or just: > > generic_do_syscall(arch, nr, regs); > > > > /* Do whatever arch ABI oddities are needed on exit from the syscall. */ > > > > instrumentation_end(); > > irqentry_exit(...); > > } > > I don't think we want that in general. The current variant is perfectly > fine for everything except the 32bit fast syscall nonsense. Also > irqentry_entry/exit is not equivalent to the syscall_enter/exit > counterparts. If there are any architectures in which actual work is needed to figure out whether something is a syscall in the first place, they'll want to do the usual kernel entry work before the syscall entry work. Maybe your patch actually makes this possible -- I haven't digested all the details yet. Who advised you to drop the arch parameter? > --- > arch/x86/entry/common.c | 29 ++++++++++++++++-------- > include/linux/entry-common.h | 51 +++++++++++++++++++++++++++++++++++-------- > kernel/entry/common.c | 35 ++++++++++++++++++++++++----- > 3 files changed, 91 insertions(+), 24 deletions(-) > > --- a/arch/x86/entry/common.c > +++ b/arch/x86/entry/common.c > @@ -60,16 +60,10 @@ > #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION) > static __always_inline unsigned int syscall_32_enter(struct pt_regs *regs) > { > - unsigned int nr = (unsigned int)regs->orig_ax; > - > if (IS_ENABLED(CONFIG_IA32_EMULATION)) > current_thread_info()->status |= TS_COMPAT; > - /* > - * Subtlety here: if ptrace pokes something larger than 2^32-1 into > - * orig_ax, the unsigned int return value truncates it. This may > - * or may not be necessary, but it matches the old asm behavior. > - */ > - return (unsigned int)syscall_enter_from_user_mode(regs, nr); > + > + return (unsigned int)regs->orig_ax; > } > > /* > @@ -91,15 +85,29 @@ static __always_inline void do_syscall_3 > { > unsigned int nr = syscall_32_enter(regs); > > + /* > + * Subtlety here: if ptrace pokes something larger than 2^32-1 into > + * orig_ax, the unsigned int return value truncates it. This may > + * or may not be necessary, but it matches the old asm behavior. > + */ > + nr = (unsigned int)syscall_enter_from_user_mode(regs, nr); > + > do_syscall_32_irqs_on(regs, nr); > syscall_exit_to_user_mode(regs); > } > > static noinstr bool __do_fast_syscall_32(struct pt_regs *regs) > { > - unsigned int nr = syscall_32_enter(regs); > + unsigned int nr = syscall_32_enter(regs); > int res; > > + /* > + * This cannot use syscall_enter_from_user_mode() as it has to > + * fetch EBP before invoking any of the syscall entry work > + * functions. > + */ > + syscall_enter_from_user_mode_prepare(regs); I'm getting lost in all these "enter" functions...