Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751945AbdF1QH6 (ORCPT ); Wed, 28 Jun 2017 12:07:58 -0400 Received: from dispatch1-us1.ppe-hosted.com ([67.231.154.164]:40016 "EHLO dispatch1-us1.ppe-hosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751580AbdF1QHt (ORCPT ); Wed, 28 Jun 2017 12:07:49 -0400 Subject: Re: [PATCH v3 net-next 02/12] bpf/verifier: rework value tracking To: Daniel Borkmann , , "Alexei Starovoitov" , Alexei Starovoitov References: <2244b48b-f415-3239-6912-cb09f0abc546@solarflare.com> <5953C7FE.1050205@iogearbox.net> CC: , , iovisor-dev From: Edward Cree Message-ID: <71aae126-352d-d916-d64a-9d4045d0abe9@solarflare.com> Date: Wed, 28 Jun 2017 17:07:36 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <5953C7FE.1050205@iogearbox.net> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.17.20.45] X-ClientProxiedBy: ocex03.SolarFlarecom.com (10.20.40.36) To ukex01.SolarFlarecom.com (10.17.10.4) X-TM-AS-Product-Ver: SMEX-11.0.0.1191-8.100.1062-23162.003 X-TM-AS-Result: No--17.305900-0.000000-31 X-TM-AS-User-Approved-Sender: Yes X-TM-AS-User-Blocked-Sender: No X-MDID: 1498666067-qNdmLwG5y27J Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4647 Lines: 97 On 28/06/17 16:15, Daniel Borkmann wrote: > On 06/27/2017 02:56 PM, Edward Cree wrote: >> Tracks value alignment by means of tracking known & unknown bits. >> Tightens some min/max value checks and fixes a couple of bugs therein. > > You mean the one in relation to patch 1/12? Would be good to elaborate > here since otherwise this gets forgotten few weeks later. That wasn't the only one; there were also some in the new min/max value calculation for ALU ops. For instance, in subtraction we were taking the new bounds as [min-min, max-max] instead of [min-max, max-min]. I can't remember what else there was and there might also have been some that I missed but that got incidentally fixed by the rewrite. But I guess I should change "checks" to "checks and updates" in the above? > Could you also document all the changes that verifier will then start > allowing for after the patch? Maybe not the changes, because the old verifier had a lot of special cases, but I could, and probably should, document the new behaviour (maybe in Documentation/networking/filter.txt, that already has a bit of description of the verifier). > [...] >> /* check whether memory at (regno + off) is accessible for t = (read | write) >> @@ -899,52 +965,79 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn >> struct bpf_reg_state *reg = &state->regs[regno]; >> int size, err = 0; >> >> - if (reg->type == PTR_TO_STACK) >> - off += reg->imm; >> - >> size = bpf_size_to_bytes(bpf_size); >> if (size < 0) >> return size; >> > [...] >> - if (reg->type == PTR_TO_MAP_VALUE || >> - reg->type == PTR_TO_MAP_VALUE_ADJ) { >> + /* for access checks, reg->off is just part of off */ >> + off += reg->off; > > Could you elaborate on why removing the reg->type == PTR_TO_STACK? Previously bpf_reg_state had a member 'imm' which, for PTR_TO_STACK, was a fixed offset, so we had to add it in to the offset. Now we instead have reg->off and it's generic to all pointerish types, so we don't need special handling of PTR_TO_STACK here. > Also in context of below PTR_TO_CTX. > > [...] >> } else if (reg->type == PTR_TO_CTX) { >> - enum bpf_reg_type reg_type = UNKNOWN_VALUE; >> + enum bpf_reg_type reg_type = SCALAR_VALUE; >> >> if (t == BPF_WRITE && value_regno >= 0 && >> is_pointer_value(env, value_regno)) { >> verbose("R%d leaks addr into ctx\n", value_regno); >> return -EACCES; >> } >> + /* ctx accesses must be at a fixed offset, so that we can >> + * determine what type of data were returned. >> + */ >> + if (!tnum_is_const(reg->var_off)) { >> + char tn_buf[48]; >> + >> + tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); >> + verbose("variable ctx access var_off=%s off=%d size=%d", >> + tn_buf, off, size); >> + return -EACCES; >> + } >> + off += reg->var_off.value; > > ... f.e. in PTR_TO_CTX case the only access that is currently > allowed is LDX/STX with fixed offset from insn->off, which is > passed as off param to check_mem_access(). Can you elaborate on > off += reg->var_off.value? Meaning we make this more dynamic > as long as access is known const? So, I can't actually figure out how to construct a pointer with a known variable offset, but future changes to the verifier (like learning from comparing two pointers with the same base) could make it possible. The situation we're handling here is where our register holds ctx + x, where x is also known to be some constant value k, and currently I don't know if that's possible except for the trivial case of k==0, and the edge case where k is too big to fit in the s32 reg->off (in which case the check_ctx_access will presumably reject it). Stepping back a bit, each register holding a pointer type has two offsets, reg->off and reg->var_off, and the latter is a tnum representing knowledge about a value that's not necessarily exactly known. But tnum_is_const checks that it _is_ exactly known. There is another case that we allow now through the reg->off handling: adding a constant to a pointer and then dereferencing it. So, with r1=ctx, instead of r2 = *(r1 + off), you can write r3 = r1 + off r2 = *(r1 + 0) if for some reason that suits you better. But in that case, because off is a known value (either an immediate, or a register whose value is exactly known), that value gets added to r3->off rather than r3->var_off; see adjust_ptr_min_max_vals(). Hope that's clear, -Ed