Date: Sat, 23 Feb 2019 13:51:26 +0900
From: Masami Hiramatsu
To: Alexei Starovoitov
Cc: Linus Torvalds, David Miller, Masami Hiramatsu, Steven Rostedt, Andy Lutomirski, Linux List Kernel Mailing, Ingo Molnar, Andrew Morton, stable, Changbin Du, Jann Horn, Kees Cook, Andrew Lutomirski, Daniel Borkmann, Netdev, bpf@vger.kernel.org
Subject: Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault
Message-Id: <20190223135126.1722237ffda9c50e66fff135@kernel.org>
In-Reply-To: <20190222235618.dxewmv5dukltaoxl@ast-mbp.dhcp.thefacebook.com>
References: <20190222192703.epvgxghwybte7gxs@ast-mbp.dhcp.thefacebook.com> <20190222.133842.1637029078039923178.davem@davemloft.net> <20190222225103.o5rr5zr4fq77jdg4@ast-mbp.dhcp.thefacebook.com> <20190222235618.dxewmv5dukltaoxl@ast-mbp.dhcp.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 22 Feb 2019 15:56:20 -0800
Alexei Starovoitov wrote:

> On Fri, Feb 22, 2019 at 03:16:35PM -0800, Linus Torvalds wrote:
> >
> > So a kernel pointer value of 0x12345678 could be a valid kernel
> > pointer pointing to some random kmalloc'ed kernel memory, and a user
> > pointer value of 0x12345678 could be a valid _user_ pointer pointing
> > to some user mapping.
> >
> > See?
> >
> > If you access a user pointer, you need to use a user accessor function
> > (eg "get_user()"), while if you access a kernel pointer you need to
> > just dereference it directly (unless you can't trust it, in which case
> > you need to use a _different_ accessor function).
>
> that was clear already.
> Reading 0x12345678 via probe_kernel_read can return valid value
> and via get_user() can return another valid value on _some_ architectures.
>
> > The fact that user and kernel pointers happen to be distinct on x86-64
> > (right now) is just a random implementation detail.
>
> yes and my point that people already rely on this implementation detail.
> Say we implement
> int bpf_probe_read(void *val, void *unsafe_ptr)
> {
>   if (probe_kernel_read(val, unsafe_ptr) == OK) {
>      return 0;
>   } else if (get_user(val, unsafe_ptr) == OK) {
>      return 0;
>   } else {
>      *val = 0;
>      return -EFAULT;
>   }
> }

Note that we cannot use get_user() from a kprobe handler. If you use it,
you have to prepare a fault_handler() and make the bpf program itself
abortable. So, maybe you can use probe_user_read() instead.

Hmm, however, it still doesn't work correctly on "some" architectures,
since whether a pointer (address) points to user space or kernel space
depends on the context. In kprobe/bpf, the context means where you put
the probe and which pointer you record.

I think only the "__user" tag tells us which one is user space.
But unfortunately, that "__user" tag is only for the compiler or checker,
not for the runtime binary. Such a useful attribute goes away when we
execute the code.

So, even if we introduce "ustring", ftrace/perf users have to decide to
use it by themselves. As far as I know, DWARF (debuginfo) also doesn't
carry that attribute, so perf-probe cannot derive it from debuginfo.
(Maybe if we introduce a C parser, it might be detected...)

> It will preserve existing bpf_probe_read() behavior on x86.
> If x86 implementation changes tomorrow then progs that read user
> addresses may start failing randomly because first probe_kernel_read()
> will be returning random values from kernel memory and that's no good,
> but at least we won't be breaking them today, so we have time to
> introduce bpf_user_read and bpf_kernel_read and folks have time to adopt them.

I see. I think bpf also has to introduce a new bpf_probe_read_user() and
keep bpf_probe_read() for kernel data only.

> Imo that's much better than making current bpf_probe_read() fail
> on user addresses today and not providing a non disruptive path forward.

Agreed.

Thank you,

-- 
Masami Hiramatsu
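
P.S. To make the ambiguity concrete, here is a toy userspace model, not
kernel code. It assumes two separate address spaces covering the same
address range, and every sim_* name is hypothetical, made up for this
illustration only. It shows that a "kernel first, then user" fallback reader
returns different data for the same address depending on what happens to be
mapped on the kernel side, while a reader that is told which space to use
stays unambiguous.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (assumption): each space covers addresses 0..SPACE_SIZE-1. */
#define SPACE_SIZE 16

struct addr_space {
	uint8_t mem[SPACE_SIZE];
	uint8_t mapped[SPACE_SIZE];	/* 1 if the address is mapped */
};

/* Map one byte at addr in the given space. */
static void sim_map(struct addr_space *as, size_t addr, uint8_t value)
{
	assert(addr < SPACE_SIZE);
	as->mem[addr] = value;
	as->mapped[addr] = 1;
}

/* Common reader: "faults" with -EFAULT when the address is not mapped. */
static int sim_read(const struct addr_space *as, uint8_t *dst, size_t addr)
{
	if (addr >= SPACE_SIZE || !as->mapped[addr])
		return -EFAULT;
	*dst = as->mem[addr];
	return 0;
}

/* Hypothetical stand-in for a kernel-only accessor. */
static int sim_probe_kernel_read(const struct addr_space *kernel,
				 uint8_t *dst, size_t addr)
{
	return sim_read(kernel, dst, addr);
}

/* Hypothetical stand-in for a user-only accessor. */
static int sim_probe_user_read(const struct addr_space *user,
			       uint8_t *dst, size_t addr)
{
	return sim_read(user, dst, addr);
}

/* Fallback in the style of the quoted sketch: try kernel, then user. */
static int sim_probe_read_fallback(const struct addr_space *kernel,
				   const struct addr_space *user,
				   uint8_t *dst, size_t addr)
{
	if (sim_probe_kernel_read(kernel, dst, addr) == 0)
		return 0;
	if (sim_probe_user_read(user, dst, addr) == 0)
		return 0;
	*dst = 0;
	return -EFAULT;
}
```

With address 5 mapped only in the user space, the fallback reader returns
the user byte. Once the same address is also mapped in the kernel space, the
fallback silently returns the kernel byte, while sim_probe_user_read() still
returns the user byte. The model has no way to guess the intended space from
the address alone, which is why the caller has to say which one it means.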