Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752037AbdGGRpX (ORCPT ); Fri, 7 Jul 2017 13:45:23 -0400 Received: from mail-pf0-f193.google.com ([209.85.192.193]:33427 "EHLO mail-pf0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750848AbdGGRpV (ORCPT ); Fri, 7 Jul 2017 13:45:21 -0400 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: [iovisor-dev] [PATCH v3 net-next 02/12] bpf/verifier: rework value tracking From: Nadav Amit In-Reply-To: Date: Fri, 7 Jul 2017 10:45:18 -0700 Cc: davem@davemloft.net, Alexei Starovoitov , Alexei Starovoitov , Daniel Borkmann , netdev@vger.kernel.org, iovisor-dev , linux-kernel@vger.kernel.org Message-Id: <3A96914E-3009-4E19-B138-7A636A76D9C8@gmail.com> References: <2244b48b-f415-3239-6912-cb09f0abc546@solarflare.com> To: Edward Cree X-Mailer: Apple Mail (2.3273) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by nfs id v67HjR6h008141 Content-Length: 6426 Lines: 135 Edward Cree wrote: > On 06/07/17 22:21, Nadav Amit wrote: >> I find it a bit surprising that such huge changes that can affect security >> and robustness are performed in one patch. > In the first version of the series, this was two patches, with "feed > pointer-to-unknown-scalar casts into scalar ALU path" split out from the rest; > but that just caused a lot of code movement and confusing diffs, hence why I > folded it into the same patch. > As for the rest of it, I'm not sure it can be split up: I'm changing the > definition and semantics of a core data structure (struct bpf_reg_state) > and I don't see any reasonable intermediate steps that would even compile. > For instance, replacing reg->imm (with its two meanings of 'known value' or > 'number of leading zero bits') with reg->var_off necessitates replacing all > the arithmetic-handling code (e.g. evaluate_reg_imm_alu()). But then > var_off also replaces reg->aux_align[_off], so all the alignment-checking > code has to change as well. And changing what register state we track > means that the pruning code (states_equal()) has to change too. > As it is, this patch leaves some selftests failing and it takes _another_ > big patch (4/12 "track signed and unsigned min/max values") to get them > all to pass again. Thanks for your polite response to my rantings. I do understand the complexity, and I did not like the two meanings of reg->imm either (it took me some time to understand them). Yet, I really doubt anyone is capable of really reviewing such a big patch. Most likely people will not even start or pick to small details they know or care about. >> Personally, I cannot comprehend >> all of these changes. In addition, I think that it is valuable to describe >> in detail the bugs that the patch solves and when they were introduced. > Mostly this patch series isn't about fixing bugs (except those which were > exposed when the changes caused some selftests to fail). Instead, it's a > combination of refactoring and unifying so that (for instance) the rules > for pointer arithmetic and alignment checking are as similar as possible > for all pointer types rather than having special-case rules for each type. > This allows many (correct) programs which the verifier will currently > reject, and makes the overall description of the verifier's rules much > shorter and simpler. I have written a documentation patch explaining > these rules, which the next version of the patch series will include. Ok. But I doubt it is as useful as commenting in the code or writing in the commit message what exactly is addressed by the patch. And documentation does not really help others to rebase their patches on top of yours. > The diff _is_ hard to read, I accept that; I think `diff` was too eager to > interpolate and match single lines like '{' from completely unrelated > functions. It might be easier to just try applying the patch to a tree > and then reading the result, rather than trying to interpret the diff. I don’t know to review patches this way. Hopefully others do, but I doubt it. For me changes such as: > if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE) > - dst_reg->min_value -= min_val; > + dst_reg->min_value -= max_val; are purely cryptic. What happened here? Was there a bug before and if so what are its implications? Why can’t it be in a separate patch? I also think that changes such as: > - s64 min_value; > - u64 max_value; [snip] > + s64 min_value; /* minimum possible (s64)value */ > + u64 max_value; /* maximum possible (u64)value */ Should have been avoided. Personally, I find this comment redundant (to say the least). It does not help to reduce the diff size. In this regard, I think that refactoring should have been done first and not together with the logic changes. As another example, changing UNKNOWN_VALUE to SCALAR_VALUE should have been a separate, easy to understand patch. >> I can bring up some concerns regarding inconsistencies in offset checks and >> size, not spilling SCALAR and my wish not to limit packet size. However, >> when the patch is that big, I think it is useless. > Please, do describe these concerns; what inconsistencies do you see? > The not spilling scalars, and the limit to packet size, are retained > from the existing code (is_spillable_regtype() and MAX_PACKET_OFF). > The latter is also needed for correctness in computing reg->range; > if 'pkt_ptr + offset' could conceivably overflow, then the result > could be < pkt_end without being a valid pointer into the packet. > We thus rely on the assumption that the packet pointer will never be > within MAX_PACKET_OFF of the top of the address space. (This > assumption is, again, carried over from the existing verifier.) I understand the limitations (I think). I agree that CONST being spillable is not directly related. As for the possible packet offsets/range: intentionally or not you do make some changes that push the 64k packet size limit even deeper into the code. While the packet size should be limited to avoid overflow, IIUC the requirement is that: 64 > log(n_insn) + log(MAX_PACKET_OFF) + 1 Such an assertion may be staticly checked (using BUILD_BUG_ON), but I don’t think should propagate into the entire code, especially in a non consistent way. For example: > struct bpf_reg_state { > enum bpf_reg_type type; > union { > - /* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */ > - s64 imm; > - > - /* valid when type == PTR_TO_PACKET* */ > - struct { > - u16 off; > - u16 range; > - }; > + /* valid when type == PTR_TO_PACKET */ > + u32 range; I would (personally) prefer range (and offsets) to be u64. I could understand if you left the range as u16 (since packet size is limited to 64k). But why would it be u32? Or: > - if (off < 0 || size <= 0 || off + size > reg->range) { > + if (off < 0 || size <= 0 || off > MAX_PACKET_OFF || > + off + size > reg->range) { Why don’t you check (off + size > max(MAX_PACKET_OFF, reg->range))? Is there a reason you ignore size when comparing to MAX_PACKET_OFF ? Unfortunately, I could only look at changes that directly concern me. As for the style, I am not a huge fan of nested or multiple if’s. Switches are easier to understand (e.g., in adjust_ptr_min_max_vals() ) and arrays with flags (with information per instruction or register-type) may be even better. Nadav