From: Andy Lutomirski
Date: Tue, 1 Dec 2020 16:04:24 -0800
Subject: Re: [PATCH v8 4/7] entry: Support Syscall User Dispatch on common syscall entry
To: Gabriel Krisman Bertazi
Cc: Andrew Lutomirski, Thomas Gleixner, Kees Cook, Paul Gofman,
    Christian Brauner, Peter Zijlstra, Matthew Wilcox, Shuah Khan, LKML,
    Linux API, "open list:KERNEL SELFTEST FRAMEWORK", X86 ML,
    kernel@collabora.com
References: <20201127193238.821364-1-krisman@collabora.com>
    <20201127193238.821364-5-krisman@collabora.com>
In-Reply-To: <20201127193238.821364-5-krisman@collabora.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 27, 2020 at 11:33 AM Gabriel Krisman Bertazi wrote:
>
> Syscall User Dispatch (SUD) must take precedence over seccomp and
> ptrace, since the use case is emulation (it can be invoked with a
> different ABI) such that seccomp filtering by syscall number doesn't
> make sense in the first place.  In addition, either the syscall is
> dispatched back to userspace, in which case there is no resource for to
> trace, or the syscall will be executed, and seccomp/ptrace will execute
> next.
>
> Since SUD runs before tracepoints, it needs to be a SYSCALL_WORK_EXIT as
> well, just to prevent a trace exit event when dispatch was triggered.
> For that, the on_syscall_dispatch() examines context to skip the
> tracepoint, audit and other work.
>
> Signed-off-by: Gabriel Krisman Bertazi
> Acked-by: Peter Zijlstra (Intel)
> ---
> Changes since v6:
>   - Update do_syscall_intercept signature (Christian Brauner)
>   - Move it to before tracepoints
>   - Use SYSCALL_WORK flags
> ---
>  include/linux/entry-common.h |  2 ++
>  kernel/entry/common.c        | 17 +++++++++++++++++
>  2 files changed, 19 insertions(+)
>
> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
> index 49b26b216e4e..a6e98b4ba8e9 100644
> --- a/include/linux/entry-common.h
> +++ b/include/linux/entry-common.h
> @@ -44,10 +44,12 @@
>                                   SYSCALL_WORK_SYSCALL_TRACE |          \
>                                   SYSCALL_WORK_SYSCALL_EMU |            \
>                                   SYSCALL_WORK_SYSCALL_AUDIT |          \
> +                                 SYSCALL_WORK_SYSCALL_USER_DISPATCH |  \
>                                   ARCH_SYSCALL_WORK_ENTER)
>  #define SYSCALL_WORK_EXIT       (SYSCALL_WORK_SYSCALL_TRACEPOINT |     \
>                                   SYSCALL_WORK_SYSCALL_TRACE |          \
>                                   SYSCALL_WORK_SYSCALL_AUDIT |          \
> +                                 SYSCALL_WORK_SYSCALL_USER_DISPATCH |  \
>                                   ARCH_SYSCALL_WORK_EXIT)
>
>  /*
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index f1b12dc32ff4..ec20aba3b890 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -6,6 +6,8 @@
>  #include
>  #include
>
> +#include "common.h"
> +
>  #define CREATE_TRACE_POINTS
>  #include
>
> @@ -47,6 +49,16 @@ static long syscall_trace_enter(struct pt_regs *regs, long syscall,
>  {
>  	long ret = 0;
>
> +	/*
> +	 * Handle Syscall User Dispatch.  This must comes first, since
> +	 * the ABI here can be something that doesn't make sense for
> +	 * other syscall_work features.
> +	 */
> +	if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
> +		if (do_syscall_user_dispatch(regs))
> +			return -1L;
> +	}
> +
>  	/* Handle ptrace */
>  	if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
>  		ret = arch_syscall_enter_tracehook(regs);
> @@ -232,6 +244,11 @@ static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
>  {
>  	bool step;
>
> +	if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
> +		if (on_syscall_dispatch())
> +			return;
> +	}

I think this would be less confusing if you just open-coded the body
of on_syscall_dispatch here and got rid of the helper.

--Andy
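
To illustrate the open-coding suggestion, here is a rough user-space
sketch, not kernel code.  The body of on_syscall_dispatch() is not
shown in this message, so the sketch assumes the helper tests and
clears a per-task "on_dispatch" flag.  struct task_sketch, its field
names, syscall_exit_work_sketch() and the bit value chosen for
SYSCALL_WORK_SYSCALL_USER_DISPATCH are all hypothetical stand-ins.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the real work bit; the value is illustrative. */
#define SYSCALL_WORK_SYSCALL_USER_DISPATCH (1UL << 5)

/* Hypothetical per-task state; assumed for this sketch, not from the patch. */
struct task_sketch {
	bool on_dispatch;	/* assumed: set when a syscall was dispatched to userspace */
	int exit_work_runs;	/* counts how often the remaining exit work ran */
};

static void syscall_exit_work_sketch(struct task_sketch *task,
				     unsigned long work)
{
	if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
		/* Open-coded check in place of the on_syscall_dispatch() helper. */
		if (task->on_dispatch) {
			task->on_dispatch = false;
			return;	/* skip the remaining exit work */
		}
	}

	/* Placeholder for the audit, tracepoint and ptrace exit handling. */
	task->exit_work_runs++;
}
```

With the test-and-clear visible at the call site, a reader can see
directly why the exit path returns early, without looking up a helper
whose name does not say that it modifies state.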