From: Chris Metcalf
To: Gilad Ben Yossef, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
    Andrew Morton, "Rik van Riel", Tejun Heo, Frederic Weisbecker,
    Thomas Gleixner, "Paul E. McKenney", Christoph Lameter, Viresh Kumar,
    Catalin Marinas, Will Deacon, Andy Lutomirski
CC: Chris Metcalf
Subject: [PATCH v12 06/13] task_isolation: support PR_TASK_ISOLATION_STRICT mode
Date: Tue, 5 Apr 2016 13:38:35 -0400
Message-ID: <1459877922-15512-7-git-send-email-cmetcalf@mellanox.com>
X-Mailer: git-send-email 2.7.2
In-Reply-To: <1459877922-15512-1-git-send-email-cmetcalf@mellanox.com>
References: <1459877922-15512-1-git-send-email-cmetcalf@mellanox.com>
X-Mailing-List: linux-kernel@vger.kernel.org

With task_isolation mode, the task is in principle guaranteed not to be
interrupted by the kernel, but only if it behaves.  In particular, if it
enters the kernel via system call, page fault, or any of a number of other
synchronous traps, it may be unexpectedly exposed to long latencies.
Add a simple flag that puts the process into a state where any such
kernel entry generates a signal.  For system calls, this test is
performed immediately before the SECCOMP test and causes the syscall
to return immediately with ENOSYS.

By default, the task is signalled with SIGKILL, but we add prctl() bits
to support requesting a specific signal instead.

To allow the state to be entered and exited, the syscall checking test
ignores the prctl() syscall so that we can clear the bit again later,
and ignores exit/exit_group to allow exiting the task without a
pointless signal killing you as you try to do so.
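For illustration only (this is not part of the patch), the flag word
described above can be sketched in userspace C.  The macro values are
copied from the prctl.h hunk of this patch; the helper names
isolation_strict_flags() and isolation_effective_signal() are
hypothetical, and the SIGKILL fallback mirrors what
task_isolation_interrupt() does when no signal number is encoded.

```c
#include <assert.h>
#include <signal.h>

/* Values copied from this patch's include/uapi/linux/prctl.h hunk. */
#define PR_TASK_ISOLATION_ENABLE	(1 << 0)
#define PR_TASK_ISOLATION_STRICT	(1 << 1)
#define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
#define PR_TASK_ISOLATION_GET_SIG(bits)	(((bits) >> 8) & 0x7f)

/*
 * Hypothetical helper: build the flag word for strict mode.
 * Pass sig == 0 to keep the default (SIGKILL).
 */
static inline unsigned int isolation_strict_flags(int sig)
{
	/* Only seven bits (8..14) are reserved for the signal number. */
	assert(sig >= 0 && sig <= 0x7f);

	return PR_TASK_ISOLATION_ENABLE | PR_TASK_ISOLATION_STRICT |
	       PR_TASK_ISOLATION_SET_SIG(sig);
}

/*
 * Hypothetical helper: the signal a violation would raise for a given
 * flag word.  An encoded value of 0 means "use SIGKILL".
 */
static inline int isolation_effective_signal(unsigned int flags)
{
	int sig = PR_TASK_ISOLATION_GET_SIG(flags);

	return sig ? sig : SIGKILL;
}
```

On a kernel carrying this series, a task would presumably pass the
result as the second argument of prctl(PR_SET_TASK_ISOLATION, ...);
that call shape is an assumption here, since PR_SET_TASK_ISOLATION
itself is introduced by an earlier patch in the series.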
Signed-off-by: Chris Metcalf
---
 include/linux/isolation.h  | 10 +++++++
 include/uapi/linux/prctl.h |  3 ++
 kernel/isolation.c         | 73 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 86 insertions(+)

diff --git a/include/linux/isolation.h b/include/linux/isolation.h
index 99b909462e64..eb78175ed811 100644
--- a/include/linux/isolation.h
+++ b/include/linux/isolation.h
@@ -36,6 +36,14 @@ static inline void task_isolation_set_flags(struct task_struct *p,
 		clear_tsk_thread_flag(p, TIF_TASK_ISOLATION);
 }
 
+extern int task_isolation_syscall(int nr);
+extern void _task_isolation_exception(const char *fmt, ...);
+#define task_isolation_exception(fmt, ...)				\
+	do {								\
+		if (current_thread_info()->flags & _TIF_TASK_ISOLATION) \
+			_task_isolation_exception(fmt, ## __VA_ARGS__); \
+	} while (0)
+
 #else
 static inline void task_isolation_init(void) { }
 static inline bool task_isolation_possible(int cpu) { return false; }
@@ -43,6 +51,8 @@ static inline bool task_isolation_ready(void) { return true; }
 static inline void task_isolation_enter(void) { }
 extern inline void task_isolation_set_flags(struct task_struct *p,
 					    unsigned int flags) { }
+static inline int task_isolation_syscall(int nr) { return 0; }
+static inline void task_isolation_exception(const char *fmt, ...) { }
 #endif
 
 #endif
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 67224df4b559..a5582ace987f 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -201,5 +201,8 @@ struct prctl_mm_map {
 #define PR_SET_TASK_ISOLATION		48
 #define PR_GET_TASK_ISOLATION		49
 # define PR_TASK_ISOLATION_ENABLE	(1 << 0)
+# define PR_TASK_ISOLATION_STRICT	(1 << 1)
+# define PR_TASK_ISOLATION_SET_SIG(sig)	(((sig) & 0x7f) << 8)
+# define PR_TASK_ISOLATION_GET_SIG(bits) (((bits) >> 8) & 0x7f)
 
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/isolation.c b/kernel/isolation.c
index b364182dd8e2..f44e90109472 100644
--- a/kernel/isolation.c
+++ b/kernel/isolation.c
@@ -11,6 +11,8 @@
 #include
 #include
 #include
+#include
+#include
 #include "time/tick-sched.h"
 
 cpumask_var_t task_isolation_map;
@@ -157,3 +159,74 @@ void task_isolation_enter(void)
 	if (!tick_nohz_tick_stopped())
 		set_tsk_need_resched(current);
 }
+
+void task_isolation_interrupt(struct task_struct *task, const char *buf)
+{
+	siginfo_t info = {};
+	int sig;
+
+	pr_warn("%s/%d: task_isolation strict mode violated by %s\n",
+		task->comm, task->pid, buf);
+
+	/* Get the signal number to use. */
+	sig = PR_TASK_ISOLATION_GET_SIG(task->task_isolation_flags);
+	if (sig == 0)
+		sig = SIGKILL;
+	info.si_signo = sig;
+
+	/*
+	 * Turn off task isolation mode entirely to avoid spamming
+	 * the process with signals. It can re-enable task isolation
+	 * mode in the signal handler if it wants to.
+	 */
+	task_isolation_set_flags(task, 0);
+
+	send_sig_info(sig, &info, task);
+}
+
+/*
+ * This routine is called from any userspace exception that doesn't
+ * otherwise trigger a signal to the user process (e.g. simple page fault).
+ */
+void _task_isolation_exception(const char *fmt, ...)
+{
+	struct task_struct *task = current;
+
+	/* RCU should have been enabled prior to this point. */
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "kernel entry without RCU");
+
+	if (task->task_isolation_flags & PR_TASK_ISOLATION_STRICT) {
+		va_list args;
+		char buf[100];
+
+		va_start(args, fmt);
+		vsnprintf(buf, sizeof(buf), fmt, args);
+		va_end(args);
+
+		task_isolation_interrupt(task, buf);
+	}
+}
+
+/*
+ * This routine is called from syscall entry (with the syscall number
+ * passed in), and in STRICT mode prevents most syscalls from executing
+ * and raises a signal to notify the process.
+ */
+int task_isolation_syscall(int syscall)
+{
+	struct task_struct *task = current;
+
+	if ((task->task_isolation_flags & PR_TASK_ISOLATION_STRICT) &&
+	    syscall != __NR_prctl &&
+	    syscall != __NR_exit && syscall != __NR_exit_group) {
+		char buf[20];
+
+		snprintf(buf, sizeof(buf), "syscall %d", syscall);
+		task_isolation_interrupt(task, buf);
+
+		syscall_set_return_value(task, current_pt_regs(), -ENOSYS, -1);
+		return -1;
+	}
+
+	return 0;
+}
-- 
2.7.2
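
As a rough userspace model of the decision made in
task_isolation_syscall() (a sketch, not kernel code), the filter can be
written as a pure function.  It assumes a Linux build host so that
<sys/syscall.h> provides SYS_prctl, SYS_exit and SYS_exit_group; the
function name strict_mode_blocks() is hypothetical, and the model
leaves out the signal delivery and the ENOSYS return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/syscall.h>	/* SYS_prctl, SYS_exit, SYS_exit_group (Linux) */

/* Value copied from this patch's include/uapi/linux/prctl.h hunk. */
#define PR_TASK_ISOLATION_STRICT	(1 << 1)

/*
 * Hypothetical model of the strict-mode check: returns true when a
 * syscall with number nr would be refused for a task whose isolation
 * flags are 'flags'.
 */
static inline bool strict_mode_blocks(unsigned int flags, long nr)
{
	/* This model only reasons about valid syscall numbers. */
	assert(nr >= 0);

	/* Without STRICT, kernel entry is never turned into a signal. */
	if (!(flags & PR_TASK_ISOLATION_STRICT))
		return false;

	/*
	 * prctl() must stay usable so the task can clear the bit again,
	 * and exit/exit_group must stay usable so the task can terminate
	 * without being signalled on the way out.
	 */
	return nr != SYS_prctl && nr != SYS_exit && nr != SYS_exit_group;
}
```

For example, strict_mode_blocks(PR_TASK_ISOLATION_STRICT, SYS_getpid)
is true, while the same flags with SYS_prctl yield false.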