Date: Wed, 02 Dec 2020 14:12:03 -0000
From: "tip-bot2 for Gabriel Krisman Bertazi"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: core/entry] entry: Support Syscall User Dispatch on common syscall entry
Cc: Gabriel Krisman Bertazi, Thomas Gleixner, Andy Lutomirski,
    "Peter Zijlstra (Intel)", Kees Cook, x86@kernel.org,
    linux-kernel@vger.kernel.org
In-Reply-To: <20201127193238.821364-5-krisman@collabora.com>
References: <20201127193238.821364-5-krisman@collabora.com>
Message-ID: <160691832324.3364.1175492009497252965.tip-bot2@tip-bot2>
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the core/entry branch of tip:

Commit-ID:     11894468e39def270199f845b76df6c36d4ed133
Gitweb:        https://git.kernel.org/tip/11894468e39def270199f845b76df6c36d4ed133
Author:        Gabriel Krisman Bertazi
AuthorDate:    Fri, 27 Nov 2020 14:32:35 -05:00
Committer:     Thomas Gleixner
CommitterDate: Wed, 02 Dec 2020 15:07:56 +01:00

entry: Support Syscall User Dispatch on common syscall entry

Syscall User Dispatch (SUD) must take precedence over seccomp and
ptrace, since the use case is emulation (it can be invoked with a
different ABI) such that seccomp filtering by syscall number doesn't
make sense in the first place. In addition, either the syscall is
dispatched back to userspace, in which case there is no resource to
trace, or the syscall will be executed, and seccomp/ptrace will execute
next.

Since SUD runs before tracepoints, it needs to be a SYSCALL_WORK_EXIT as
well, just to prevent a trace exit event when dispatch was triggered.
For that, the on_syscall_dispatch() examines context to skip the
tracepoint, audit and other work.

[ tglx: Add a comment on the exit side ]

Signed-off-by: Gabriel Krisman Bertazi
Signed-off-by: Thomas Gleixner
Reviewed-by: Andy Lutomirski
Acked-by: Peter Zijlstra (Intel)
Acked-by: Kees Cook
Link: https://lore.kernel.org/r/20201127193238.821364-5-krisman@collabora.com
---
 include/linux/entry-common.h |  2 ++
 kernel/entry/common.c        | 25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 49b26b2..a6e98b4 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -44,10 +44,12 @@
 				 SYSCALL_WORK_SYSCALL_TRACE	|	\
 				 SYSCALL_WORK_SYSCALL_EMU	|	\
 				 SYSCALL_WORK_SYSCALL_AUDIT	|	\
+				 SYSCALL_WORK_SYSCALL_USER_DISPATCH |	\
 				 ARCH_SYSCALL_WORK_ENTER)
 #define SYSCALL_WORK_EXIT	(SYSCALL_WORK_SYSCALL_TRACEPOINT |	\
 				 SYSCALL_WORK_SYSCALL_TRACE	|	\
 				 SYSCALL_WORK_SYSCALL_AUDIT	|	\
+				 SYSCALL_WORK_SYSCALL_USER_DISPATCH |	\
 				 ARCH_SYSCALL_WORK_EXIT)
 
 /*
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 91e8fd5..e661e70 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -5,6 +5,8 @@
 #include
 #include
 
+#include "common.h"
+
 #define CREATE_TRACE_POINTS
 #include
 
@@ -46,6 +48,16 @@ static long syscall_trace_enter(struct pt_regs *regs, long syscall,
 {
 	long ret = 0;
 
+	/*
+	 * Handle Syscall User Dispatch.  This must come first, since
+	 * the ABI here can be something that doesn't make sense for
+	 * other syscall_work features.
+	 */
+	if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
+		if (syscall_user_dispatch(regs))
+			return -1L;
+	}
+
 	/* Handle ptrace */
 	if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
 		ret = arch_syscall_enter_tracehook(regs);
@@ -230,6 +242,19 @@ static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
 {
 	bool step;
 
+	/*
+	 * If the syscall was rolled back due to syscall user dispatching,
+	 * then the tracers below are not invoked for the same reason as
+	 * the entry side was not invoked in syscall_trace_enter(): The ABI
+	 * of these syscalls is unknown.
+	 */
+	if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
+		if (unlikely(current->syscall_dispatch.on_dispatch)) {
+			current->syscall_dispatch.on_dispatch = false;
+			return;
+		}
+	}
+
 	audit_syscall_exit(regs);
 
 	if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
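
The ordering described in the changelog can be illustrated outside the kernel
with a small userspace model. This is only a sketch under stated assumptions,
not the kernel implementation: the toy_* names, the WORK_* flag values and the
want_dispatch field are all hypothetical, and the model assumes that the
dispatch helper is what sets on_dispatch (the patch above only shows the exit
side clearing it). It shows that a dispatched syscall returns -1 before any
tracer runs on entry, and that the exit side then skips the tracers once and
clears the marker.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the real SYSCALL_WORK_* bits are defined elsewhere. */
enum {
	WORK_TRACE         = 1u << 0,
	WORK_AUDIT         = 1u << 1,
	WORK_USER_DISPATCH = 1u << 2,
};

/* Toy stand-in for the per-task state read via current->syscall_dispatch. */
struct toy_task {
	bool want_dispatch;	/* hypothetical: dispatch triggers for this call */
	bool on_dispatch;	/* marks a syscall bounced back to userspace */
	int tracer_entries;	/* entry-side tracer invocations */
	int tracer_exits;	/* exit-side tracer invocations */
};

/* Stand-in for syscall_user_dispatch(): true when the call is dispatched. */
static bool toy_user_dispatch(struct toy_task *t)
{
	if (!t->want_dispatch)
		return false;
	t->on_dispatch = true;	/* assumption: set here, cleared on exit */
	return true;
}

/* Entry side: dispatch is checked before any tracer gets to run. */
static long toy_trace_enter(struct toy_task *t, unsigned int work, long syscall)
{
	if (work & WORK_USER_DISPATCH) {
		if (toy_user_dispatch(t))
			return -1L;	/* rolled back, tracers never see it */
	}
	if (work & (WORK_TRACE | WORK_AUDIT))
		t->tracer_entries++;
	return syscall;
}

/* Exit side: a dispatched syscall clears the marker and skips the tracers. */
static void toy_exit_work(struct toy_task *t, unsigned int work)
{
	if (work & WORK_USER_DISPATCH) {
		if (t->on_dispatch) {
			t->on_dispatch = false;
			return;
		}
	}
	if (work & (WORK_TRACE | WORK_AUDIT))
		t->tracer_exits++;
}

/* Walks one dispatched and one ordinary syscall through both paths. */
static int toy_demo(void)
{
	struct toy_task t = { .want_dispatch = true };
	unsigned int work = WORK_TRACE | WORK_USER_DISPATCH;
	long ret;

	ret = toy_trace_enter(&t, work, 42);
	toy_exit_work(&t, work);
	assert(ret == -1L && t.tracer_entries == 0 && t.tracer_exits == 0);

	t.want_dispatch = false;
	ret = toy_trace_enter(&t, work, 42);
	toy_exit_work(&t, work);
	assert(ret == 42 && t.tracer_entries == 1 && t.tracer_exits == 1);
	return 0;
}
```

Calling toy_demo() from a test harness exercises both cases. Having the user
dispatch bit in both the entry and the exit work masks is what lets the exit
path run at all for a task that has no other work pending.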