Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751584AbdF1ToQ (ORCPT ); Wed, 28 Jun 2017 15:44:16 -0400 Received: from www62.your-server.de ([213.133.104.62]:35751 "EHLO www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751495AbdF1ToM (ORCPT ); Wed, 28 Jun 2017 15:44:12 -0400 Message-ID: <59540701.4010105@iogearbox.net> Date: Wed, 28 Jun 2017 21:44:01 +0200 From: Daniel Borkmann User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: Edward Cree , davem@davemloft.net, Alexei Starovoitov , Alexei Starovoitov CC: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, iovisor-dev Subject: Re: [PATCH v3 net-next 02/12] bpf/verifier: rework value tracking References: <2244b48b-f415-3239-6912-cb09f0abc546@solarflare.com> <5953C7FE.1050205@iogearbox.net> <71aae126-352d-d916-d64a-9d4045d0abe9@solarflare.com> In-Reply-To: <71aae126-352d-d916-d64a-9d4045d0abe9@solarflare.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Authenticated-Sender: daniel@iogearbox.net Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5618 Lines: 117 On 06/28/2017 06:07 PM, Edward Cree wrote: > On 28/06/17 16:15, Daniel Borkmann wrote: >> On 06/27/2017 02:56 PM, Edward Cree wrote: >>> Tracks value alignment by means of tracking known & unknown bits. >>> Tightens some min/max value checks and fixes a couple of bugs therein. >> >> You mean the one in relation to patch 1/12? Would be good to elaborate >> here since otherwise this gets forgotten few weeks later. > That wasn't the only one; there were also some in the new min/max value > calculation for ALU ops. For instance, in subtraction we were taking > the new bounds as [min-min, max-max] instead of [min-max, max-min]. > I can't remember what else there was and there might also have been some > that I missed but that got incidentally fixed by the rewrite. But I > guess I should change "checks" to "checks and updates" in the above? Ok. Would be good though to have them all covered in the selftests part of your series if possible, so we can make sure to keep track of these cases. >> Could you also document all the changes that verifier will then start >> allowing for after the patch? > Maybe not the changes, because the old verifier had a lot of special > cases, but I could, and probably should, document the new behaviour > (maybe in Documentation/networking/filter.txt, that already has a bit > of description of the verifier). Yeah, that would definitely help; filter.txt should be fine. >> [...] >>> /* check whether memory at (regno + off) is accessible for t = (read | write) >>> @@ -899,52 +965,79 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn >>> struct bpf_reg_state *reg = &state->regs[regno]; >>> int size, err = 0; >>> >>> - if (reg->type == PTR_TO_STACK) >>> - off += reg->imm; >>> - >>> size = bpf_size_to_bytes(bpf_size); >>> if (size < 0) >>> return size; >>> >> [...] >>> - if (reg->type == PTR_TO_MAP_VALUE || >>> - reg->type == PTR_TO_MAP_VALUE_ADJ) { >>> + /* for access checks, reg->off is just part of off */ >>> + off += reg->off; >> >> Could you elaborate on why removing the reg->type == PTR_TO_STACK? > Previously bpf_reg_state had a member 'imm' which, for PTR_TO_STACK, was > a fixed offset, so we had to add it in to the offset. Now we instead > have reg->off and it's generic to all pointerish types, so we don't need > special handling of PTR_TO_STACK here. >> Also in context of below PTR_TO_CTX. >> >> [...] >>> } else if (reg->type == PTR_TO_CTX) { >>> - enum bpf_reg_type reg_type = UNKNOWN_VALUE; >>> + enum bpf_reg_type reg_type = SCALAR_VALUE; >>> >>> if (t == BPF_WRITE && value_regno >= 0 && >>> is_pointer_value(env, value_regno)) { >>> verbose("R%d leaks addr into ctx\n", value_regno); >>> return -EACCES; >>> } >>> + /* ctx accesses must be at a fixed offset, so that we can >>> + * determine what type of data were returned. >>> + */ >>> + if (!tnum_is_const(reg->var_off)) { >>> + char tn_buf[48]; >>> + >>> + tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); >>> + verbose("variable ctx access var_off=%s off=%d size=%d", >>> + tn_buf, off, size); >>> + return -EACCES; >>> + } >>> + off += reg->var_off.value; >> >> ... f.e. in PTR_TO_CTX case the only access that is currently >> allowed is LDX/STX with fixed offset from insn->off, which is >> passed as off param to check_mem_access(). Can you elaborate on >> off += reg->var_off.value? Meaning we make this more dynamic >> as long as access is known const? > So, I can't actually figure out how to construct a pointer with a known > variable offset, but future changes to the verifier (like learning from > comparing two pointers with the same base) could make it possible. The > situation we're handling here is where our register holds ctx + x, > where x is also known to be some constant value k, and currently I don't > know if that's possible except for the trivial case of k==0, and the edge > case where k is too big to fit in the s32 reg->off (in which case the > check_ctx_access will presumably reject it). > Stepping back a bit, each register holding a pointer type has two offsets, > reg->off and reg->var_off, and the latter is a tnum representing > knowledge about a value that's not necessarily exactly known. But > tnum_is_const checks that it _is_ exactly known. Right, I was reviewing this with the thought in mind where we could run into a pruning situation where in the first path we either add a scalar or offset to the ctx ptr that is then spilled to stack, later filled to a reg again with eventual successful exit. And the second path would prune on the spilled reg, but even if scalar, we require that it's a _known_ const whereas reading back from stack marks it unknown, so that is not possible. So all is fine; including your below example since it all has to be a _known_ scalar. > There is another case that we allow now through the reg->off handling: > adding a constant to a pointer and then dereferencing it. > So, with r1=ctx, instead of r2 = *(r1 + off), you can write > r3 = r1 + off > r2 = *(r1 + 0) > if for some reason that suits you better. But in that case, because off > is a known value (either an immediate, or a register whose value is > exactly known), that value gets added to r3->off rather than r3->var_off; > see adjust_ptr_min_max_vals(). > > Hope that's clear, Yep, thanks!