Received: by 2002:ac0:a581:0:0:0:0:0 with SMTP id m1-v6csp56357imm; Mon, 2 Jul 2018 07:36:30 -0700 (PDT) X-Google-Smtp-Source: AAOMgpcY0gRu/+TcK1PXxj5CPkZvY402pMl9CCXZTC9oJ23CVIkjgXZHv4K4zaWK6uiqcNkt+bbx X-Received: by 2002:a62:284a:: with SMTP id o71-v6mr25754842pfo.67.1530542190819; Mon, 02 Jul 2018 07:36:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1530542190; cv=none; d=google.com; s=arc-20160816; b=i0g8NclPV4OOX9jUkl+lOCquMhovix3pOxbetS/q9ZAxlBJ0epYBVRKr99avNt8RLM UPVj3yGVYcE9w+7R3prxTo35wf/JMlveYZ+l1E40qn0oIfrObsbwByexptzJE8XQHe4v RV9SfPtOaHlHOl75AfqFrEgQPLySL9ZiTD1adMaeiPweyZrijeepuu0S9iXR3iIHdl9p o4j7GbU6aPc9EgLijiJELr70Gieo37Oe24guLkd6AA1vDEAK0mwifUTfqciHNHWaCu0c qXFACtpRvVncKhbBFCoP68/TnWOpAbbYHZRcpMFa3UmnxzNII/Ds7tp+E1U7Sl7U9YZL Hqwg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:thread-index:thread-topic :content-transfer-encoding:mime-version:subject:references :in-reply-to:message-id:cc:to:from:date:dkim-signature:dkim-filter :arc-authentication-results; bh=5IPuu3BGwgE15PQNbTIeXfIUS3378KS18eqdKiTQyvE=; b=YOZRmAoBPFAU17lejw+xHTho9mlX9ahLm6smDXffzAxIXeXhNB5YG9MmQXwpE6ep98 3pKGxhX+b172BgBrNL+QwfCjx0BMpxdoAtLIScs+nk5VfZqZKOpVBTwM5fIwawIdU9kw +ZHg9/2N9FLellSY2v57cmkCF2yApYzwTtf/aX68vNBd1rdnRW94EIPLqOXaxaTlgkWA bV8tYb9h4MIGCM0uBOU5HnWNqaiMh3Kt29fIJFGzUnFgsf4MrQNmDYuEsumIL/Ej/q9m 3E7UH4tXPX2zMxlrwtZD3FKrqAJf4HZkTdN5S0Fu1ECKJtHu0hTgYib12dGXlx8gbAWE K0ZQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@efficios.com header.s=default header.b="YRW/BCSe"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=efficios.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p185-v6si11775672pga.476.2018.07.02.07.36.15; Mon, 02 Jul 2018 07:36:30 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@efficios.com header.s=default header.b="YRW/BCSe"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=efficios.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752114AbeGBOcu (ORCPT + 99 others); Mon, 2 Jul 2018 10:32:50 -0400 Received: from mail.efficios.com ([167.114.142.138]:45288 "EHLO mail.efficios.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751443AbeGBOcs (ORCPT ); Mon, 2 Jul 2018 10:32:48 -0400 Received: from localhost (ip6-localhost [IPv6:::1]) by mail.efficios.com (Postfix) with ESMTP id 2CE1222E1B8; Mon, 2 Jul 2018 10:32:47 -0400 (EDT) Received: from mail.efficios.com ([IPv6:::1]) by localhost (mail02.efficios.com [IPv6:::1]) (amavisd-new, port 10032) with ESMTP id wxv3nHNLgg2m; Mon, 2 Jul 2018 10:32:46 -0400 (EDT) Received: from localhost (ip6-localhost [IPv6:::1]) by mail.efficios.com (Postfix) with ESMTP id 8CB3522E1AA; Mon, 2 Jul 2018 10:32:46 -0400 (EDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.efficios.com 8CB3522E1AA DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=efficios.com; s=default; t=1530541966; bh=5IPuu3BGwgE15PQNbTIeXfIUS3378KS18eqdKiTQyvE=; h=Date:From:To:Message-ID:MIME-Version; b=YRW/BCSeJ9NWPaj/R7WD0zK/qhI45W41uq/xEKiiJKh323ZXtPzok3mkQGibwuGNe ZzwUuTjpUs58UdAwuJZhb/xck1xW46qQ/GSsVB4CizH/klnFit750MabcYG4/799Bm sY4PtgpQ1BjdBdvhVHUZ8DmQEUwLDBkn3VJlcZlihP9m2Y4cXnHGE73MeywEFxloWS w4g2mfWu47O1iv3DvuQnPm7IlZoDq2udG/rSrS1vuDMklV+YSRWaezS2709JMtVThy fa8J1rOG0qwYFOipjqN6ZLO1p2KQecX95a6U/4G8nJ5bOFO6Mmkogaq/RV0C0GFkLG I+47xfAlHuuNA== X-Virus-Scanned: amavisd-new at efficios.com Received: from mail.efficios.com ([IPv6:::1]) by localhost (mail02.efficios.com [IPv6:::1]) (amavisd-new, port 10026) with ESMTP id Aw-hCGYbBPgo; Mon, 2 Jul 2018 10:32:46 -0400 (EDT) Received: from mail02.efficios.com (mail02.efficios.com [167.114.142.138]) by mail.efficios.com (Postfix) with ESMTP id 632B722E191; Mon, 2 Jul 2018 10:32:46 -0400 (EDT) Date: Mon, 2 Jul 2018 10:32:46 -0400 (EDT) From: Mathieu Desnoyers To: Andy Lutomirski Cc: Linus Torvalds , Andy Lutomirski , Thomas Gleixner , linux-kernel , linux-api , Peter Zijlstra , "Paul E. McKenney" , Boqun Feng , Dave Watson , Paul Turner , Andrew Morton , Russell King , Ingo Molnar , "H. Peter Anvin" , Andi Kleen , Chris Lameter , Ben Maurer , rostedt , Josh Triplett , Catalin Marinas , Will Deacon , Michael Kerrisk , Joel Fernandes Message-ID: <1527399163.10673.1530541966296.JavaMail.zimbra@efficios.com> In-Reply-To: References: <20180628162359.9054-1-mathieu.desnoyers@efficios.com> <729451355.9702.1530284622326.JavaMail.zimbra@efficios.com> <247789350.9741.1530288432573.JavaMail.zimbra@efficios.com> <184287091.10022.1530301738384.JavaMail.zimbra@efficios.com> Subject: Re: [RFC PATCH for 4.18 1/2] rseq: validate rseq_cs fields are < TASK_SIZE MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [167.114.142.138] X-Mailer: Zimbra 8.8.8_GA_2096 (ZimbraWebClient - FF52 (Linux)/8.8.8_GA_1703) Thread-Topic: rseq: validate rseq_cs fields are < TASK_SIZE Thread-Index: EfNoEXYdmNtWTN5CDgNZeik/VS9bfA== Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org ----- On Jun 29, 2018, at 4:39 PM, Andy Lutomirski luto@amacapital.net wrote: > On Fri, Jun 29, 2018 at 12:48 PM, Mathieu Desnoyers > wrote: >> There are two aspects I'm concerned about here: >> >> 1) security: we don't want 32-bit user-space to feed a 64-bit value over 4GB >> as abort_ip that may end up causing OOPSes on architectures that would >> lack proper validation of those values on return to userspace. > > I'm not too worried about this. As long as you're doing it from > signal-delivery context (which you are AFAICT) you're fine. No, it's not just signal-delivery context. It's _also_ called from return to usermode loop, which can by called on return from interrupt/trap/syscall. > > But I re-read the code and I think I have a really straightforward > solution. Two choices: > > (1) Change instruction_pointer_set() to return an error code if the > address passed in is garbage in a way that could cause unexpected > behavior (like >=2^32 on x86_64 if regs->cs is 32-bit). It has very > very few callers. This would take care of my security concern wrt abort_ip, but would not provide consistent behavior for the other fields. Also, perhaps this kind of change should aim the next merge window ? > > (2) Add instruction_pointer_validate() to go along with > instruction_pointer_set(). > > That should be enough to solve the problem, right? This would only handle the "security" part of the matter, which is specifically related to rseq->rseq_cs->abort_ip. What is left is ensuring that we have consistent behavior for other fields: [ Note: we have introduced this helper macro: LINUX_FIELD_u32_u64 which defines a field which is 64-bit for 64-bit processes, and 32-bit with 32-bit of padding for 32-bit processes. ] * rseq->rseq_cs: (userspace pointer to user-space, updated by user-space with single-copy atomicity): current type: LINUX_FIELD_u32_u64, cannot be changed to __u64 due to single-copy atomicity requirement, * rseq->rseq_cs->start_ip: currently a LINUX_FIELD_u32_u64, could become a __u64, * rseq->rseq_cs->post_commit_ip: currently a LINUX_FIELD_u32_u64, could become a __u64, * rseq->rseq_cs->abort_ip: currently a LINUX_FIELD_u32_u64, could become a __u64, For abort_ip, changing the type to __u64 and using the instruction_pointer_validate() approach you propose would work. For start_ip and post_commit_ip, we need to decide whether we want to kill a 32-bit process setting the high bits or if we just accept and use the full __u64 content on both 32-bit and 64-bit kernels. Those two fields are only used for arithmetic comparison. Using the full __u64 content means using 64-bit arithmetic on 32-bit native kernels though. If we decide to kill the offending process (whether the field type is __u64 or LINUX_FIELD_u32_u64), we need to be able to figure out if the process is a compat task when called from signal delivery and from return to usermode loop (return from irq/trap/syscall). Comparison with TASK_SIZE solves the issue for abort_ip, post_commit_ip and start_ip, although nobody seems to like that approach very much. For rseq->rseq_cs, we cannot use __u64 due to single-copy atomicity update requirement for 32-bit processes. However, we are using this field in a copy_from_user(), so it will EFAULT if the high-bits are set by a compat 32-bit task on a 64-bit kernel. We can therefore check that the padding is zeroed explicitly on a native 32-bit kernel to provide a consistent behavior. Specifically because rseq->rseq_cs is checked with access_ok(), it is therefore enough to check the padding when __LP64__ is not defined by the preprocessor. But rather than trying to play games with input validation, I would favor an approach that would allow rseq to validate all its inputs straightforwardly. Introducing user_64bit_mode(struct pt_regs *) across all architectures would allow doing just that. rseq signal delivery and return to usermode code could then ensure that high bits are cleared by 32-bit tasks for all fields and thus provide a consistent behavior for 32-bit tasks running on 32-bit and 64-bit kernels. Thoughts ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com