Received: by 2002:ac0:b08d:0:0:0:0:0 with SMTP id l13csp1983149imc; Fri, 22 Feb 2019 15:13:53 -0800 (PST) X-Google-Smtp-Source: AHgI3IbbElF2ii+g1Gmgpa4O7eQZKezR32oW8jFaaqMMxwP8fkMCwSUcjlcghOXoNuuww8krHpS9 X-Received: by 2002:a63:d507:: with SMTP id c7mr6058595pgg.105.1550877233048; Fri, 22 Feb 2019 15:13:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1550877233; cv=none; d=google.com; s=arc-20160816; b=OH9sv7MaGc2f8pawY9X9NajbhjqiNN4W7F0zKnGOEE/s0GYz6TJKv4t57Z+JBJGX6c GHBzJz/CFVfwrKVILiOXvQcU+hW1oLb+1E28CT1puRrszV/zEJibc27vDQnUurSsbpwO TGZJvM6YxzqOoDxJdU0uOrwOzPxtL2b5UMoAI9gkd+VTtoOoIW+4ppio3asmagYUGPSP di4A2yH7/RPaDKO9iihDMuMhI9OOoEgbTOO5FUHJEf5Wb2SCiyhaklPRgetarLjmG9EE 9qZLpK0f1F12cIJE6087gkGOdWBaeO2uQUjFWy91HS6tjM46WuC0h5KT0+Ig3qL/4oIf Jv3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=mmDN9X/UEjCpZcgMNiozJuCrFz7Y2Sx2GWMobE40puc=; b=z0A5RG5gwbhEQ/T4CmsI+BIbmq5kGS/8u9caBP6/RAmzo+awF2/KhH/MnB92mv+xWV sttI9AGYT8aipjxr6KjCMulFpQJ+xTjLvna/EgBY1eEKxC1plUmZiWtuhYt0rklGro6q Z6In581dqY/82i04Ia8c3GM+sLE/D/rQzqU6eWvV4yZMpB9RnbL/eI267nZ8t4zrf5RA e08slw5Vd/bD2EWACLx800ZEKzB8+WpnQBaXBaJ7TrRXhtZxlHFp3B1mkfaSsxgcgJQ1 TFUc6NLLBaz/A/PajtAWLatY3o0KjblVIP23cR7QGwSyZdS4KD3MjN+tdYE8jCS5h3+6 aMmQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="i1WPsh8/"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h64si2358038pge.55.2019.02.22.15.13.36; Fri, 22 Feb 2019 15:13:53 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="i1WPsh8/"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727287AbfBVXM1 (ORCPT + 99 others); Fri, 22 Feb 2019 18:12:27 -0500 Received: from mail-ot1-f66.google.com ([209.85.210.66]:37314 "EHLO mail-ot1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727183AbfBVXM0 (ORCPT ); Fri, 22 Feb 2019 18:12:26 -0500 Received: by mail-ot1-f66.google.com with SMTP id b3so3309817otp.4 for ; Fri, 22 Feb 2019 15:12:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=mmDN9X/UEjCpZcgMNiozJuCrFz7Y2Sx2GWMobE40puc=; b=i1WPsh8/eSfOd3jLSd+Hkqkpp9cKP16/WqbUrEkH0rd9CBLUnQI8RGI18kbJKfcutS pyXX4xyPwb3K0O8QS0HE+DSbVvkXcF4s2ahIeUAiugJ9R/AXpG0Mit2uHPKf03+6FEDr 7ptTSA/DybTt6tClC0VLxCgLGxCZLqlDJlewbol0uJfZ+2/4DambWq7MlnG3r3SGM3Xf xnHVpEwW80rDzyATJzD3NEGS/2CByGmadtnJxv9A+r7uHdIxKtvYzZZrt6cqh7yCVnwn LwMzYXtnuIcXsvUFacWau0M8BvpO1jNOKpgV6azmvCHUQBPjoSSuxbqaUiHdn/f+bn1h TuLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=mmDN9X/UEjCpZcgMNiozJuCrFz7Y2Sx2GWMobE40puc=; b=ie6L7U+UGYu+CODll1G93bepTCjQ8rphLYTFEv2bh8qFEwfBcQ73vb2HLpc5u8Yvt2 nywE6ZRMW+NMtaT1QwBGsjDOanrF7NJw8Qm/P4sCZwML98A/FL3vT2aaWGmTGnxLmT09 LWoqXYl4IEn2HUX/8aNg9qvIWWyHNjvTVSeTIkmIa1Hu7n0vSGhQHODzSqkYfzeng2zZ vazJG/3/j/PXAWxfsrEpVg3ip3G90X5xkPhap95vHHvDni6XJ8DexKR7lnljoU0Spw0z OBfET2BvKZQobsNew5oPEOiROHShHG1NKHRZqfcawCR4YSOCIDWhK8zmfUosZWMqbnTN hV4g== X-Gm-Message-State: AHQUAuYM7kQlJt2wTkNP885UgevmxZJyE9czRDJMHC455BYf/VT7pkVi 5jZwwraLYyPnC2s05K4w7+gMlYpvpFtDkyJ8VLdV0Q== X-Received: by 2002:a9d:66d0:: with SMTP id t16mr4257261otm.35.1550877145204; Fri, 22 Feb 2019 15:12:25 -0800 (PST) MIME-Version: 1.0 References: <20190222192703.epvgxghwybte7gxs@ast-mbp.dhcp.thefacebook.com> <20190222.133842.1637029078039923178.davem@davemloft.net> <20190222225103.o5rr5zr4fq77jdg4@ast-mbp.dhcp.thefacebook.com> In-Reply-To: <20190222225103.o5rr5zr4fq77jdg4@ast-mbp.dhcp.thefacebook.com> From: Jann Horn Date: Sat, 23 Feb 2019 00:11:58 +0100 Message-ID: Subject: Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault To: Alexei Starovoitov Cc: Linus Torvalds , David Miller , Masami Hiramatsu , Steven Rostedt , Andy Lutomirski , Linux List Kernel Mailing , Ingo Molnar , Andrew Morton , stable , Changbin Du , Kees Cook , Andrew Lutomirski , Daniel Borkmann , Netdev , bpf@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Feb 22, 2019 at 11:51 PM Alexei Starovoitov wrote: > > On Fri, Feb 22, 2019 at 01:59:10PM -0800, Linus Torvalds wrote: > > On Fri, Feb 22, 2019 at 1:38 PM David Miller wrote: > > > > > > Don't be surprised if we see more separation like this in the future too. > > > > Yes, with the whole meltdown fiasco, there's actually more pressure to > > add more support for separation of kernel/user address spaces. As Andy > > pointed out, it's been discussed as a future wish-list for x86-64 too. > > > > But yeah, right now the *common* architectures all distinguish kernel > > and user space by pointers (ie x86-64, arm64 and powerpc). > > That's all fine. I'm missing rationale for making probe_kernel_read() > fail on user addresses. > What is fundamentally wrong with a function probe_any_address_read() ? I think what Linus is saying is: There are some scenarios (like a system with the old 4G/4G X86 patch) where *the same* address can refer to two different pieces of memory, depending on whether you interpret it as a kernel pointer or a user pointer. So for example, if your BPF program tries to read tsk->comm, it works, but if the BPF program then tries to read from PT_REGS_PARM2(ctx), it's going to actually interpret the userspace address as a kernel address and read completely different memory. On top of that, from the security angle, this means that if a user passes a kernel pointer into a syscall, they can trick a tracing BPF program into looking at random kernel memory instead of the user's memory. That may or may not be problematic, depending on what you do afterwards with the data you've read. (For example, if this is a service that collects performance data and then saves it to some world-readable location on disk because the data it is collecting (including comm strings) isn't supposed to be sensitive, you might have a problem.) > For context, typical bpf kprobe program looks like this: > #define probe_read(P) \ > ({typeof(P) val = 0; bpf_probe_read(&val, sizeof(val), &P); val;}) > SEC("kprobe/__set_task_comm") > int bpf_prog(struct pt_regs *ctx) > { > struct signal_struct *signal; > struct task_struct *tsk; > char oldcomm[16] = {}; > char newcomm[16] = {}; > u16 oom_score_adj; > u32 pid; > > tsk = (void *)PT_REGS_PARM1(ctx); > pid = probe_read(tsk->pid); > bpf_probe_read(oldcomm, sizeof(oldcomm), &tsk->comm); > bpf_probe_read(newcomm, sizeof(newcomm), (void *)PT_REGS_PARM2(ctx)); > signal = probe_read(tsk->signal); > oom_score_adj = probe_read(signal->oom_score_adj); > ... > } > > where PT_REGS_PARMx macros are defined per architecture. > On x86 it's #define PT_REGS_PARM1(x) ((x)->di) > > The program writer has to know the meaning of function arguments. > In this example they need to know that __set_task_comm is defined as > void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec) in the kernel. > > Right now these programs just call bpf_probe_read() on whatever data > they need to access and not differentiating whether it's user or kernel. > > One idea we discussed is to split bpf_probe_read() into kernel_read and user_read > helpers, but in the BPF verifier we cannot determine which address space > the program wants to access. The prog writer needs to manually analyze the program > to use correct one. But mistakes are possible and cannot be fatal. > On the kernel side we have to be safe. > Both probe_kernel_read and probe_user_read must not panic if a pointer > from wrong address space was passed. > > Hence my preference is to keep probe_kernel_read as "try read any address". > The function can be renamed to indicate so. >