Message-ID: <595413AA.40502@iogearbox.net>
Date: Wed, 28 Jun 2017 22:38:02 +0200
From: Daniel Borkmann
To: Edward Cree, davem@davemloft.net, Alexei Starovoitov, Alexei Starovoitov
CC: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, iovisor-dev
Subject: Re: [PATCH v3 net-next 00/12] bpf: rewrite value tracking in verifier
References: <5953B436.6030506@iogearbox.net> <788035e1-1974-b48e-3008-d294194a8b05@solarflare.com>
In-Reply-To: <788035e1-1974-b48e-3008-d294194a8b05@solarflare.com>

On 06/28/2017 04:11 PM, Edward Cree wrote:
> On 28/06/17 14:50, Daniel Borkmann wrote:
>> Hi Edward,
>>
>> Did you also have a chance in the meantime to look at reducing complexity
>> along with your unification? I did run the cilium test suite with your
>> latest set from here, and the current worst-case number of processed insns
>> that the verifier has to go through for cilium progs increases from the
>> ~53k we have right now to ~76k. I'm a bit worried that this quickly gets
>> us close to the upper ~98k max limit and starts rejecting programs again.
>> The alternative is to bump the complexity limit again in the near future
>> once we run into it, but preferably there's a way to optimize it along
>> with the rewrite? Do you see any possibilities worth exploring?
> The trouble, I think, is that as we're now tracking more information about
> each register value, we're less able to prune branches. But often that
> information is not actually being used in reaching the exit state. So it

Agree.

> seems like the way to tackle this would be to track what information is
> used — or at least, which registers are read from (including e.g. writing
> through them or passing them to helper calls) — in reaching a safe state.
> Then only registers which are used are required to match for pruning.
> But that tracking would presumably have to propagate backwards through the
> verifier stack, and I'm not sure how easily that could be done. Someone
> (was it you?) was talking about replacing the current DAG walking and
> pruning with some kind of basic-block thing, which would help with this.
> Summary: I think it could be done, but I haven't looked into the details
> of implementation yet; if it's not actually breaking your programs (yet),
> maybe leave it for a followup patch series?

Could we adapt the limit to 128k perhaps as part of this set, given we know
that we're tracking more meta data here anyway? Then we could potentially
avoid having to go via -stable later on. The biggest pain point is usually
tracking differences in LLVM code generation (e.g. differences in
optimizations) along with verifier changes, to make sure that programs still
keep loading on older kernels with e.g. newer LLVM; one of the issues is
that pruning can be quite fragile. E.g. in the worst case, adding a simple
variable in a branch, where LLVM assigns it a stack slot that was otherwise
not used throughout the prog, can cause a significant increase in verifier
work (I ran into this multiple times in the past and it is a bit of a pain
to track down, actually). If we could keep some buffer in
BPF_COMPLEXITY_LIMIT_INSNS, at least when we know that more work is needed
anyway from that point onward, that would be good.