2021-06-17 13:32:01

by Marco Elver

Subject: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

[+Daniel, just FYI. We had a discussion about "functional coverage"
and fuzzing, and I've just seen your wonderful work on RV. If you have
thought about fuzzing with RV and how coverage of the model impacts
test generation, I'd be curious to hear.]

Looks like there is ongoing work on specifying models and running them
along with the kernel: https://lwn.net/Articles/857862/

Those models that are run alongside the kernel would have their own
coverage, and since there's a mapping between real code and model, a
fuzzer trying to reach new code in one or the other will ultimately
improve coverage for both.

Just wanted to document this here, because it seems quite relevant.
I'm guessing that "functional coverage" would indeed be a side-effect
of a good RV model?

Previous discussion below.

Thanks,
-- Marco

On Wed, 19 May 2021 at 22:24, Marco Elver <[email protected]> wrote:
> On Wed, 19 May 2021 at 20:53, Paul E. McKenney <[email protected]> wrote:
> > On Wed, May 19, 2021 at 11:02:43AM +0200, Marco Elver wrote:
> > > On Tue, 18 May 2021 at 22:42, Paul E. McKenney <[email protected]> wrote:
> > > [...]
> > > > > All the above sound like "functional coverage" to me, and could be
> > > > > implemented on top of a well-thought-out functional coverage API.
> > > > > Functional coverage is common in the hardware verification space to
> > > > > drive simulation and model checking; for example, functional coverage
> > > > > could be "buffer is full" vs just structural (code) coverage which
> > > > > cannot capture complex state properties like that easily.
> > > > >
> > > > > Similarly, you could then say things like "number of held locks" or
> > > > > even alluding to your example (5) above, "observed race on address
> > > > > range". In the end, with decent functional coverage abstractions,
> > > > > anything should hopefully be possible.
> > > >
> > > > Those were in fact the lines along which I was thinking.
> > > >
> > > > > I've been wondering if this could be something useful for the Linux
> > > > > kernel, but my guess has always been that it'd not be too well
> > > > > received because people don't like to see strange annotations in their
> > > > > code. But maybe I'm wrong.
> > > >
> > > > I agree that it is much easier to get people to use a tool that does not
> > > > require annotations. In fact, it is best if it requires nothing at all
> > > > from them...
> > >
> > > While I'd like to see something like that (it'd be beneficial to have
> > > properties of the code written down, documenting its behaviour
> > > better while also being machine checkable), like you say, if it
> > > requires additional effort, it's a difficult sell. (Although the same
> > > is true for all other efforts to improve reliability that require a
> > > departure from the "way it used to be done", be it data_race(), or
> > > even efforts introducing whole new programming languages to the
> > > kernel.)
> >
> > Fair point! But what exactly did you have in mind?
>
> Good question, I'll try to be more concrete -- most of these are
> half-baked ideas and questions ;-), but if any of it makes sense, I
> should maybe write a doc to summarize.
>
> What I had in mind is a system to write properties, both for functional
> coverage and for checking more general properties of the kernel (though
> I'm not sure how useful the latter would be). But all this wouldn't
> really be used for anything other than debug builds.
>
> Assume we start with macros such as "ASSERT_COVER(...)" (for
> functional coverage) and "ASSERT(...)" (just plain-old assertions).
> The former is a way to document potentially interesting states (useful
> for fuzzers to reach them), and the latter just a way to specify
> properties of the system (useful for finding the actual bugs).
> Implementation-wise the latter is trivial; the former requires some
> thought on how to expose that information to fuzzers and how to use
> it (as Dmitry suggested, it's not trivial). I'd also imagine we can have
> module-level variants ("GLOBAL_ASSERT*(...)") that monitor some global
> state, and also add support for some subset of temporal properties
> like "GLOBAL_ASSERT_EVENTUALLY(precond, eventually_holds)" as
> suggested below.
>
> I guess maybe I'd have to take a step back and just ask why we have no
> way to write plain and simple assertions that are removed in non-debug
> builds? Some subsystems seem to roll their own, as a 'git grep
> "#define ASSERT"' shows.
>
> Is there a fundamental reason why we shouldn't have them, perhaps
> there was some past discussion? Today we have things like
> lockdep_assert_held(), but nothing to even write a simple assert
> otherwise. If I had to guess why something like ASSERT is bad, it is
> because it gives people a way to check for unexpected conditions, but
> if those checks disappear in non-debug builds, the kernel might be
> unstable. Therefore every possible state must be handled and we must
> always be able to recover. The argument in favor is that the ASSERT()s
> would be proven invariants, or conditions where we'd recover either way,
> and are only there to catch accidental regressions during testing; and in
> non-debug builds we don't suffer the performance overheads.
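[ For illustration, a minimal sketch of what such macros could look like.
The ASSERT/ASSERT_COVER names come from the discussion above;
CONFIG_DEBUG_ASSERTS and the macro bodies are hypothetical: ]

#ifdef CONFIG_DEBUG_ASSERTS
/* Plain assertion: complain once if a supposedly proven invariant breaks. */
#define ASSERT(cond) WARN_ONCE(!(cond), "ASSERT failed: " #cond)
/* Coverage point: a distinct branch whose execution a fuzzer can observe. */
#define ASSERT_COVER(cond)                                              \
        do {                                                            \
                if (cond)                                               \
                        barrier(); /* distinct basic block for coverage */ \
        } while (0)
#else /* non-debug builds: both compile away entirely */
#define ASSERT(cond) do { } while (0)
#define ASSERT_COVER(cond) do { } while (0)
#endif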
..
> > > > > The ideal abstractions I've been thinking of aren't just for coverage,
> > > > > but to also capture temporal properties (which should be inspired by
> > > > > something like LTL or such), on top of which you can also build
> > > > > coverage. Then we can specify things like "if I observe some state X,
> > > > > then eventually we observe state Y", and such logic can also just be
> > > > > used to define functional coverage of interest (again all this
> > > > > inspired by what's already done in hardware verification).
> > > >
> > > > Promela/spin provides an LTL interface, but of course cannot handle
> > > > much of RCU, let alone of the entire kernel. And LTL can be quite
> > > > useful. But in a runtime system, how do you decide when "eventually"
> > > > has arrived? The lockdep system does so by tracking entry to idle
> > > > and to userspace execution, along with exit from interrupt handlers.
> > > > Or did you have something else in mind?
> > >
> > > For coverage, one could simply await the transition to the "eventually
> > > state" indefinitely; once reached we have coverage.
> > >
> > > But for verification, because unlike explicit state model checkers
> > > like Spin, we don't have the complete state and can't build an
> > > exhaustive state-graph, we'd have to approximate. And without knowing
> > > exactly what it is we're waiting for, the simplest option would be to
> > > just rely on a timeout, either part of the property or implicit. What
> > > the units of that timeout are I'm not sure, because a system might
> > > e.g. be put to sleep.


Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On 6/17/21 1:20 PM, Marco Elver wrote:
> [+Daniel, just FYI. We had a discussion about "functional coverage"
> and fuzzing, and I've just seen your wonderful work on RV. If you have
> thought about fuzzing with RV and how coverage of the model impacts
> test generation, I'd be curious to hear.]

One aspect of RV is that we verify the actual execution of the system instead of
a complete model of the system, so we depend on testing to cover all the
aspects of the system <-> model mapping.

There is a natural relation between testing/fuzzing & friends and RV.

> Looks like there is ongoing work on specifying models and running them
> along with the kernel: https://lwn.net/Articles/857862/
>
> Those models that are run alongside the kernel would have their own
> coverage, and since there's a mapping between real code and model, a
> fuzzer trying to reach new code in one or the other will ultimately
> improve coverage for both.

Perfect!

> Just wanted to document this here, because it seems quite relevant.
> I'm guessing that "functional coverage" would indeed be a side-effect
> of a good RV model?

So, let me see if I understood the terms. Functional coverage is a way to check
if all the desired aspects of a code/system/subsystem/functionality were covered
by a set of tests?

If that is correct, we could use RV to:

- create an explicit model of the states we want to cover.
- check if all the desired states were visited during testing.

?

-- Daniel

> Previous discussion below.
>
> Thanks,
> -- Marco
>
> On Wed, 19 May 2021 at 22:24, Marco Elver <[email protected]> wrote:
>> On Wed, 19 May 2021 at 20:53, Paul E. McKenney <[email protected]> wrote:
>>> On Wed, May 19, 2021 at 11:02:43AM +0200, Marco Elver wrote:
>>>> On Tue, 18 May 2021 at 22:42, Paul E. McKenney <[email protected]> wrote:
>>>> [...]
>>>>>> All the above sound like "functional coverage" to me, and could be
>>>>>> implemented on top of a well-thought-out functional coverage API.
>>>>>> Functional coverage is common in the hardware verification space to
>>>>>> drive simulation and model checking; for example, functional coverage
>>>>>> could be "buffer is full" vs just structural (code) coverage which
>>>>>> cannot capture complex state properties like that easily.
>>>>>>
>>>>>> Similarly, you could then say things like "number of held locks" or
>>>>>> even alluding to your example (5) above, "observed race on address
>>>>>> range". In the end, with decent functional coverage abstractions,
>>>>>> anything should hopefully be possible.
>>>>>
>>>>> Those were in fact the lines along which I was thinking.
>>>>>
>>>>>> I've been wondering if this could be something useful for the Linux
>>>>>> kernel, but my guess has always been that it'd not be too well
>>>>>> received because people don't like to see strange annotations in their
>>>>>> code. But maybe I'm wrong.
>>>>>
>>>>> I agree that it is much easier to get people to use a tool that does not
>>>>> require annotations. In fact, it is best if it requires nothing at all
>>>>> from them...
>>>>
>>>> While I'd like to see something like that, because it'd be beneficial
>>>> to see properties of the code written down to document its behaviour
>>>> better and at the same time machine checkable, like you say, if it
>>>> requires additional effort, it's a difficult sell. (Although the same
>>>> is true for all other efforts to improve reliability that require a
>>>> departure from the "way it used to be done", be it data_race(), or
>>>> even efforts introducing whole new programming languages to the
>>>> kernel.)
>>>
>>> Fair point! But what exactly did you have in mind?
>>
>> Good question, I'll try to be more concrete -- most of these are
>> half-baked ideas and questions ;-), but if any of it makes sense, I
>> should maybe write a doc to summarize.
>>
>> What I had in mind is a system to write properties for both functional
>> coverage, but also checking more general properties of the kernel. The
>> latter I'm not sure about how useful. But all this isn't really used
>> for anything other than in debug builds.
>>
>> Assume we start with macros such as "ASSERT_COVER(...)" (for
>> functional coverage) and "ASSERT(...)" (just plain-old assertions).
>> The former is a way to document potentially interesting states (useful
>> for fuzzers to reach them), and the latter just a way to specify
>> properties of the system (useful for finding the actual bugs).
>> Implementation-wise the latter is trivial; the former requires some
>> thought on how to expose that information to fuzzers and how to use
>> it (as Dmitry suggested, it's not trivial). I'd also imagine we can have
>> module-level variants ("GLOBAL_ASSERT*(...)") that monitor some global
>> state, and also add support for some subset of temporal properties
>> like "GLOBAL_ASSERT_EVENTUALLY(precond, eventually_holds)" as
>> suggested below.
>>
>> I guess maybe I'd have to take a step back and just ask why we have no
>> way to write plain and simple assertions that are removed in non-debug
>> builds? Some subsystems seem to roll their own, which a 'git grep
>> "#define ASSERT"' tells me.
>>
>> Is there a fundamental reason why we shouldn't have them, perhaps
>> there was some past discussion? Today we have things like
>> lockdep_assert_held(), but nothing to even write a simple assert
>> otherwise. If I had to guess why something like ASSERT is bad, it is
>> because it gives people a way to check for unexpected conditions, but
>> if those checks disappear in non-debug builds, the kernel might be
>> unstable. Therefore every possible state must be handled and we must
>> always be able to recover. The argument in favor is, if the ASSERT()s
>> are proven invariants or conditions where we'd recover either way, and
>> are only there to catch accidental regressions during testing; and in
>> non-debug builds we don't suffer the performance overheads.
> ..
>>>>>> The ideal abstractions I've been thinking of aren't just for coverage,
>>>>>> but to also capture temporal properties (which should be inspired by
>>>>>> something like LTL or such), on top of which you can also build
>>>>>> coverage. Then we can specify things like "if I observe some state X,
>>>>>> then eventually we observe state Y", and such logic can also just be
>>>>>> used to define functional coverage of interest (again all this
>>>>>> inspired by what's already done in hardware verification).
>>>>>
>>>>> Promela/spin provides an LTL interface, but of course cannot handle
>>>>> much of RCU, let alone of the entire kernel. And LTL can be quite
>>>>> useful. But in a runtime system, how do you decide when "eventually"
>>>>> has arrived? The lockdep system does so by tracking entry to idle
>>>>> and to userspace execution, along with exit from interrupt handlers.
>>>>> Or did you have something else in mind?
>>>>
>>>> For coverage, one could simply await the transition to the "eventually
>>>> state" indefinitely; once reached we have coverage.
>>>>
>>>> But for verification, because unlike explicit state model checkers
>>>> like Spin, we don't have the complete state and can't build an
>>>> exhaustive state-graph, we'd have to approximate. And without knowing
>>>> exactly what it is we're waiting for, the simplest option would be to
>>>> just rely on a timeout, either part of the property or implicit. What
>>>> the units of that timeout are I'm not sure, because a system might
>>>> e.g. be put to sleep.
>

2021-06-18 12:42:34

by Marco Elver

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On Fri, Jun 18, 2021 at 09:58AM +0200, Daniel Bristot de Oliveira wrote:
> On 6/17/21 1:20 PM, Marco Elver wrote:
> > [+Daniel, just FYI. We had a discussion about "functional coverage"
> > and fuzzing, and I've just seen your wonderful work on RV. If you have
> > thought about fuzzing with RV and how coverage of the model impacts
> > test generation, I'd be curious to hear.]
>
> One aspect of RV is that we verify the actual execution of the system instead of
> a complete model of the system, so we depend on testing to cover all the
> aspects of the system <-> model mapping.
>
> There is a natural relation between testing/fuzzing & friends and RV.
>
> > Looks like there is ongoing work on specifying models and running them
> > along with the kernel: https://lwn.net/Articles/857862/
> >
> > Those models that are run alongside the kernel would have their own
> > coverage, and since there's a mapping between real code and model, a
> > fuzzer trying to reach new code in one or the other will ultimately
> > improve coverage for both.
>
> Perfect!
>
> > Just wanted to document this here, because it seems quite relevant.
> > I'm guessing that "functional coverage" would indeed be a side-effect
> > of a good RV model?
>
> So, let me see if I understood the terms. Functional coverage is a way to check
> if all the desired aspects of a code/system/subsystem/functionality were covered
> by a set of tests?

Yes, unlike code/structural coverage (which is what we have today via
KCOV), functional coverage checks whether some interesting states were
reached (e.g. was the buffer full/empty, did we observe transition a->b, etc.).

Functional coverage is common in hardware verification, but of course
software verification would benefit just as much -- just haven't seen it
used much in practice yet.
[ Example for HW verification: https://www.chipverify.com/systemverilog/systemverilog-functional-coverage ]

It still requires some creativity from the designer/developer to come up
with suitable functional coverage. State explosion is a problem, too,
and naturally it is impractical to capture all possible states ... after
all, functional coverage is meant to direct the test generator/fuzzer
into more interesting states -- we're not doing model checking.

> If that is correct, we could use RV to:
>
> - create an explicit model of the states we want to cover.
> - check if all the desired states were visited during testing.
>
> ?

Yes, pretty much. On one hand there could be an interface to query if
all states were covered, but I think this isn't useful out of the box.
Instead, I was thinking we can simply get KCOV to help us out: my
hypothesis is that most of this would happen automatically if dot2k's
generated code has distinct code paths per transition.

If KCOV covers the RV model (since it's executable kernel C code), then
having distinct code paths for "state transitions" will effectively give
us functional coverage indirectly through code coverage (via KCOV) of
the RV model.

From what I can tell this doesn't quite happen today, because
automaton::function is a lookup table implemented as an array. Could this just
become a generated function with a switch statement? Because then I
think we'd pretty much have all the ingredients we need.
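[ For concreteness, a sketch of the two shapes; the state/event names are
made up, only the table-vs-switch structure matters: ]

enum { S0, S1, N_STATES };
enum { E_A, E_B, N_EVENTS };
#define INVALID_STATE ((char)-1)

/* Today (roughly): one array load in a single basic block, i.e. one KCOV
 * coverage point no matter which transition fired. */
static const char function[N_STATES][N_EVENTS] = {
        [S0] = { S1, INVALID_STATE },
        [S1] = { INVALID_STATE, S0 },
};

static char get_next_state(char state, char event)
{
        return function[state][event];
}

/* Hypothetical generated alternative: one branch per state/event pair, so
 * each transition becomes its own coverage point. */
static char get_next_state_switch(char state, char event)
{
        switch (state) {
        case S0:
                if (event == E_A)
                        return S1;
                break;
        case S1:
                if (event == E_B)
                        return S0;
                break;
        }
        return INVALID_STATE;
}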

Then:

1. Create RV models for states of interest not covered by normal code
coverage of code under test.

2. Enable KCOV for everything.

3. KCOV's coverage of the RV model will tell us if we reached the
desired "functional coverage" (and can be used by e.g. syzbot to
generate better tests without any additional changes because it
already talks to KCOV).

Thoughts?

Thanks,
-- Marco

2021-06-19 20:33:24

by Dmitry Vyukov

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On Fri, Jun 18, 2021 at 1:26 PM Marco Elver <[email protected]> wrote:
>
> On Fri, Jun 18, 2021 at 09:58AM +0200, Daniel Bristot de Oliveira wrote:
> > On 6/17/21 1:20 PM, Marco Elver wrote:
> > > [+Daniel, just FYI. We had a discussion about "functional coverage"
> > > and fuzzing, and I've just seen your wonderful work on RV. If you have
> > > thought about fuzzing with RV and how coverage of the model impacts
> > > test generation, I'd be curious to hear.]
> >
> > One aspect of RV is that we verify the actual execution of the system instead of
> > a complete model of the system, so we depend on testing to cover all the
> > aspects of the system <-> model mapping.
> >
> > There is a natural relation between testing/fuzzing & friends and RV.
> >
> > > Looks like there is ongoing work on specifying models and running them
> > > along with the kernel: https://lwn.net/Articles/857862/
> > >
> > > Those models that are run alongside the kernel would have their own
> > > coverage, and since there's a mapping between real code and model, a
> > > fuzzer trying to reach new code in one or the other will ultimately
> > > improve coverage for both.
> >
> > Perfect!
> >
> > > Just wanted to document this here, because it seems quite relevant.
> > > I'm guessing that "functional coverage" would indeed be a side-effect
> > > of a good RV model?
> >
> > So, let me see if I understood the terms. Functional coverage is a way to check
> > if all the desired aspects of a code/system/subsystem/functionality were covered
> > by a set of tests?
>
> Yes, unlike code/structural coverage (which is what we have today via
> KCOV) functional coverage checks if some interesting states were reached
> (e.g. was buffer full/empty, did we observe transition a->b etc.).
>
> Functional coverage is common in hardware verification, but of course
> software verification would benefit just as much -- just haven't seen it
> used much in practice yet.
> [ Example for HW verification: https://www.chipverify.com/systemverilog/systemverilog-functional-coverage ]
>
> It still requires some creativity from the designer/developer to come up
> with suitable functional coverage. State explosion is a problem, too,
> and naturally it is impractical to capture all possible states ... after
> all, functional coverage is meant to direct the test generator/fuzzer
> into more interesting states -- we're not doing model checking after all.
>
> > If that is correct, we could use RV to:
> >
> > - create an explicit model of the states we want to cover.
> > - check if all the desired states were visited during testing.
> >
> > ?
>
> Yes, pretty much. On one hand there could be an interface to query if
> all states were covered, but I think this isn't useful out-of-the box.
> Instead, I was thinking we can simply get KCOV to help us out: my
> hypothesis is that most of this would happen automatically if dot2k's
> generated code has distinct code paths per transition.
>
> If KCOV covers the RV model (since it's executable kernel C code), then
> having distinct code paths for "state transitions" will effectively give
> us functional coverage indirectly through code coverage (via KCOV) of
> the RV model.
>
> From what I can tell this doesn't quite happen today, because
> automaton::function is a lookup table as an array. Could this just
> become a generated function with a switch statement? Because then I
> think we'd pretty much have all the ingredients we need.
>
> Then:
>
> 1. Create RV models for states of interest not covered by normal code
> coverage of code under test.
>
> 2. Enable KCOV for everything.
>
> 3. KCOV's coverage of the RV model will tell us if we reached the
> desired "functional coverage" (and can be used by e.g. syzbot to
> generate better tests without any additional changes because it
> already talks to KCOV).
>
> Thoughts?

I think there is usually already some code for any important state
transitions. E.g. I can't imagine how a socket can transition to
active/listen/shutdown/closed states w/o any code.

I see RV as potentially more useful for the "coverage dimensions"
idea. I.e. for sockets that would mean treating coverage for a socket
function X as different coverage based on the current socket state,
effectively considering (PC,state) as the feedback signal.
But my concern is that we don't want to simply consider combinations
of all kernel code multiplied by all combinations of states of all RV
models. Most likely this will lead to severe feedback signal
explosion. So the question is: how do we understand that the socket
model relates only to this restricted set of code?
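[ A sketch of the "coverage dimensions" idea; the helper is hypothetical,
the point being that the feedback tuple is (PC, model state) rather than
just the PC: ]

/* The same code reached while the model is in a different state counts
 * as new coverage for the fuzzer. */
static inline u64 coverage_signal(unsigned long pc, int model_state)
{
        return (u64)pc ^ ((u64)model_state << 48);
}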

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On 6/18/21 1:26 PM, Marco Elver wrote:
> On Fri, Jun 18, 2021 at 09:58AM +0200, Daniel Bristot de Oliveira wrote:
>> On 6/17/21 1:20 PM, Marco Elver wrote:
>>> [+Daniel, just FYI. We had a discussion about "functional coverage"
>>> and fuzzing, and I've just seen your wonderful work on RV. If you have
>>> thought about fuzzing with RV and how coverage of the model impacts
>>> test generation, I'd be curious to hear.]
>>
>> One aspect of RV is that we verify the actual execution of the system instead of
>> a complete model of the system, so we depend on testing to cover all the
>> aspects of the system <-> model mapping.
>>
>> There is a natural relation between testing/fuzzing & friends and RV.
>>
>>> Looks like there is ongoing work on specifying models and running them
>>> along with the kernel: https://lwn.net/Articles/857862/
>>>
>>> Those models that are run alongside the kernel would have their own
>>> coverage, and since there's a mapping between real code and model, a
>>> fuzzer trying to reach new code in one or the other will ultimately
>>> improve coverage for both.
>>
>> Perfect!
>>
>>> Just wanted to document this here, because it seems quite relevant.
>>> I'm guessing that "functional coverage" would indeed be a side-effect
>>> of a good RV model?
>>
>> So, let me see if I understood the terms. Functional coverage is a way to check
>> if all the desired aspects of a code/system/subsystem/functionality were covered
>> by a set of tests?
>
> Yes, unlike code/structural coverage (which is what we have today via
> KCOV) functional coverage checks if some interesting states were reached
> (e.g. was buffer full/empty, did we observe transition a->b etc.).

So you want to observe a given a->b transition, not that B was visited?

> Functional coverage is common in hardware verification, but of course
> software verification would benefit just as much -- just haven't seen it
> used much in practice yet.
> [ Example for HW verification: https://www.chipverify.com/systemverilog/systemverilog-functional-coverage ]
>
> It still requires some creativity from the designer/developer to come up
> with suitable functional coverage.

That is where the fun lives.

> State explosion is a problem, too,
> and naturally it is impractical to capture all possible states ... after
> all, functional coverage is meant to direct the test generator/fuzzer
> into more interesting states -- we're not doing model checking after all.


I still need to understand what you are aiming to verify, and what
approach you would like to use to express the specifications of the system...

Can you give me a simple example?

>> If that is correct, we could use RV to:
>>
>> - create an explicit model of the states we want to cover.
>> - check if all the desired states were visited during testing.
>>
>> ?
>
> Yes, pretty much. On one hand there could be an interface to query if
> all states were covered, but I think this isn't useful out-of-the box.
> Instead, I was thinking we can simply get KCOV to help us out: my
> hypothesis is that most of this would happen automatically if dot2k's
> generated code has distinct code paths per transition.

...

>
> If KCOV covers the RV model (since it's executable kernel C code), then
> having distinct code paths for "state transitions" will effectively give
> us functional coverage indirectly through code coverage (via KCOV) of
> the RV model.

so, you want to have a different function for every transition so KCOV can
observe that?

>
> From what I can tell this doesn't quite happen today, because
> automaton::function is a lookup table as an array.

It is the transition function of the formal automaton definition. Check this:

https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf

page 9.

> Could this just
> become a generated function with a switch statement? Because then I
> think we'd pretty much have all the ingredients we need.

a switch statement that would.... call a different function for each transition?

> Then:
>
> 1. Create RV models for states of interest not covered by normal code
> coverage of code under test.
>
> 2. Enable KCOV for everything.
>
> 3. KCOV's coverage of the RV model will tell us if we reached the
> desired "functional coverage" (and can be used by e.g. syzbot to
> generate better tests without any additional changes because it
> already talks to KCOV).
>
> Thoughts?
>
> Thanks,
> -- Marco
>

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On 6/19/21 1:08 PM, Dmitry Vyukov wrote:
> On Fri, Jun 18, 2021 at 1:26 PM Marco Elver <[email protected]> wrote:
>>
>> On Fri, Jun 18, 2021 at 09:58AM +0200, Daniel Bristot de Oliveira wrote:
>>> On 6/17/21 1:20 PM, Marco Elver wrote:
>>>> [+Daniel, just FYI. We had a discussion about "functional coverage"
>>>> and fuzzing, and I've just seen your wonderful work on RV. If you have
>>>> thought about fuzzing with RV and how coverage of the model impacts
>>>> test generation, I'd be curious to hear.]
>>>
>>> One aspect of RV is that we verify the actual execution of the system instead of
>>> a complete model of the system, so we depend on testing to cover all the
>>> aspects of the system <-> model mapping.
>>>
>>> There is a natural relation between testing/fuzzing & friends and RV.
>>>
>>>> Looks like there is ongoing work on specifying models and running them
>>>> along with the kernel: https://lwn.net/Articles/857862/
>>>>
>>>> Those models that are run alongside the kernel would have their own
>>>> coverage, and since there's a mapping between real code and model, a
>>>> fuzzer trying to reach new code in one or the other will ultimately
>>>> improve coverage for both.
>>>
>>> Perfect!
>>>
>>>> Just wanted to document this here, because it seems quite relevant.
>>>> I'm guessing that "functional coverage" would indeed be a side-effect
>>>> of a good RV model?
>>>
>>> So, let me see if I understood the terms. Functional coverage is a way to check
>>> if all the desired aspects of a code/system/subsystem/functionality were covered
>>> by a set of tests?
>>
>> Yes, unlike code/structural coverage (which is what we have today via
>> KCOV) functional coverage checks if some interesting states were reached
>> (e.g. was buffer full/empty, did we observe transition a->b etc.).
>>
>> Functional coverage is common in hardware verification, but of course
>> software verification would benefit just as much -- just haven't seen it
>> used much in practice yet.
>> [ Example for HW verification: https://www.chipverify.com/systemverilog/systemverilog-functional-coverage ]
>>
>> It still requires some creativity from the designer/developer to come up
>> with suitable functional coverage. State explosion is a problem, too,
>> and naturally it is impractical to capture all possible states ... after
>> all, functional coverage is meant to direct the test generator/fuzzer
>> into more interesting states -- we're not doing model checking after all.
>>
>>> If that is correct, we could use RV to:
>>>
>>> - create an explicit model of the states we want to cover.
>>> - check if all the desired states were visited during testing.
>>>
>>> ?
>>
>> Yes, pretty much. On one hand there could be an interface to query if
>> all states were covered, but I think this isn't useful out-of-the box.
>> Instead, I was thinking we can simply get KCOV to help us out: my
>> hypothesis is that most of this would happen automatically if dot2k's
>> generated code has distinct code paths per transition.
>>
>> If KCOV covers the RV model (since it's executable kernel C code), then
>> having distinct code paths for "state transitions" will effectively give
>> us functional coverage indirectly through code coverage (via KCOV) of
>> the RV model.
>>
>> From what I can tell this doesn't quite happen today, because
>> automaton::function is a lookup table as an array. Could this just
>> become a generated function with a switch statement? Because then I
>> think we'd pretty much have all the ingredients we need.
>>
>> Then:
>>
>> 1. Create RV models for states of interest not covered by normal code
>> coverage of code under test.
>>
>> 2. Enable KCOV for everything.
>>
>> 3. KCOV's coverage of the RV model will tell us if we reached the
>> desired "functional coverage" (and can be used by e.g. syzbot to
>> generate better tests without any additional changes because it
>> already talks to KCOV).
>>
>> Thoughts?
>
> I think there is usually already some code for any important state
> transitions. E.g. I can't imagine how a socket can transition to
> active/listen/shutdown/closed states w/o any code.

makes sense...

> I see RV to be potentially more useful for the "coverage dimensions"
> idea. I.e. for sockets that would be treating coverage for a socket
> function X as different coverage based on the current socket state,
> effectively consider (PC,state) as feedback signal.

How can the RV subsystem talk with KCOV?

> But my concern is that we don't want to simply consider combinations
> of all kernel code multiplied by all combinations of states of all RV
> models.

I agree! Also because RV monitors will generally monitor a specific part of the
code (with exceptions for models like the preemption one).

> Most likely this will lead to severe feedback signal
> explosion. So the question is: how do we understand that the socket
> model relates only to this restricted set of code?
>

Should we annotate a model, saying which subsystem it monitors/verifies?

-- Daniel

2021-06-21 10:31:42

by Marco Elver

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
[...]
> > Yes, unlike code/structural coverage (which is what we have today via
> > KCOV) functional coverage checks if some interesting states were reached
> > (e.g. was buffer full/empty, did we observe transition a->b etc.).
>
> So you want to observe a given a->b transition, not that B was visited?

An a->b transition would imply that a and b were visited.

> I still need to understand what you are aiming to verify, and what is the
> approach that you would like to use to express the specifications of the systems...
>
> Can you give me a simple example?

The older discussion started around how to get the fuzzer
into more interesting states in complex concurrent algorithms. But
otherwise I have no idea ... we were just brainstorming and got to the
point where it looked like "functional coverage" would improve automated
test generation in general. And then I found RV which pretty much can
specify "functional coverage" and almost gets that information to KCOV
"for free".

> so, you want to have a different function for every transition so KCOV can
> observe that?

Not a different function, just distinct "basic blocks". KCOV uses
compiler instrumentation, and a sequence of non-branching instructions
denotes one point of coverage; at the next branch (conditional or otherwise)
it then records which branch was taken and therefore we know which code
paths were covered.
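[ Roughly what the instrumented code looks like after the compiler pass;
__sanitizer_cov_trace_pc() is KCOV's actual hook, the function itself is
a sketch: ]

void example(int x)
{
        __sanitizer_cov_trace_pc();         /* coverage point: entry block */
        if (x > 0) {
                __sanitizer_cov_trace_pc(); /* coverage point: then-branch */
                do_a();
        } else {
                __sanitizer_cov_trace_pc(); /* coverage point: else-branch */
                do_b();
        }
}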

> >
> > From what I can tell this doesn't quite happen today, because
> > automaton::function is a lookup table as an array.
>
> It is the transition function of the formal automaton definition. Check this:
>
> https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
>
> page 9.
>
> > Could this just
> > become a generated function with a switch statement? Because then I
> > think we'd pretty much have all the ingredients we need.
>
> a switch statement that would.... call a different function for each transition?

No, just a switch statement that returns the same thing as it does
today. But KCOV wouldn't see different coverage with the
current version, because it's all in one basic block: it looks up
the next state given the current state out of the array. If it was a
switch statement doing the same thing, the compiler would turn it
into conditional branches, and KCOV would then know which code path
(effectively the transition) was covered.

> > Then:
> >
> > 1. Create RV models for states of interest not covered by normal code
> > coverage of code under test.
> >
> > 2. Enable KCOV for everything.
> >
> > 3. KCOV's coverage of the RV model will tell us if we reached the
> > desired "functional coverage" (and can be used by e.g. syzbot to
> > generate better tests without any additional changes because it
> > already talks to KCOV).
> >
> > Thoughts?
> >
> > Thanks,
> > -- Marco

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On 6/21/21 12:30 PM, Marco Elver wrote:
> On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
> [...]
>>> Yes, unlike code/structural coverage (which is what we have today via
>>> KCOV) functional coverage checks if some interesting states were reached
>>> (e.g. was buffer full/empty, did we observe transition a->b etc.).
>>
>> So you want to observe a given a->b transition, not that B was visited?
>
> An a->b transition would imply that a and b were visited.

HA! let's try again with a less abstract example...


| +------------ on --+----------------+
v ^ +--------v v
+========+ | +===========+>--- suspend ---->+===========+
| OFF | +- on --<| ON | | SUSPENDED |
+========+ <------ shutdown -----<+===========+<----- on -------<+===========+
^ v v
+--------------- off ----------------+-----------------------------+

Do you care about:

1) states [OFF|ON|SUSPENDED] being visited a # of times; or
2) the occurrence of the [on|suspend|off] events a # of times; or
3) the language generated by the "state machine"; like:

the occurrence of *"on -> suspend -> on -> off"*

which is != of

the occurrence of *"on -> on -> suspend -> off"*

although the same events and states occurred the same # of times
?

RV can give you all of these... but the way to report them might be different.
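[ For reference, the diagram above encoded as a dot2k-style transition
table (as in automaton::function); a sketch only, with the transition set
read off the ASCII art, so it may not match the intended model exactly: ]

enum states { OFF, ON, SUSPENDED, N_STATES };
enum events { EV_ON, EV_SUSPEND, EV_SHUTDOWN, EV_OFF, N_EVENTS };
#define INVALID_STATE ((char)-1)

static const char function[N_STATES][N_EVENTS] = {
        /*             on   suspend         shutdown        off           */
        [OFF]       = { ON, INVALID_STATE,  INVALID_STATE,  INVALID_STATE },
        [ON]        = { ON, SUSPENDED,      OFF,            OFF           },
        [SUSPENDED] = { ON, INVALID_STATE,  INVALID_STATE,  OFF           },
};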

>> I still need to understand what you are aiming to verify, and what is the
>> approach that you would like to use to express the specifications of the systems...
>>
>> Can you give me a simple example?
>
> The older discussion started around how to get the fuzzer
> into more interesting states in complex concurrent algorithms. But
> otherwise I have no idea ... we were just brainstorming and got to the
> point where it looked like "functional coverage" would improve automated
> test generation in general. And then I found RV which pretty much can
> specify "functional coverage" and almost gets that information to KCOV
> "for free".

I think we will end up with an almost-for-free solution, but one worth the price.

>> so, you want to have a different function for every transition so KCOV can
>> observe that?
>
> Not a different function, just distinct "basic blocks". KCOV uses
> compiler instrumentation, and a sequence of non-branching instructions
> denotes one point of coverage; at the next branch (conditional or otherwise)
> it then records which branch was taken and therefore we know which code
> paths were covered.

ah, got it. But can't KCOV be extended with another source of information?

>>>
>>> From what I can tell this doesn't quite happen today, because
>>> automaton::function is a lookup table as an array.
>>
>> It is the transition function of the formal automaton definition. Check this:
>>
>> https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
>>
>> page 9.
>>
>>> Could this just
>>> become a generated function with a switch statement? Because then I
>>> think we'd pretty much have all the ingredients we need.
>>
>> a switch statement that would.... call a different function for each transition?
>
> No, just a switch statement that returns the same thing as it does
> today. But KCOV wouldn't see different coverage with the
> current version because it's all in one basic block because it looks up
> the next state given the current state out of the array. If it was a
> switch statement doing the same thing, the compiler will turn the thing
> into conditional branches and KCOV then knows which code path
> (effectively the transition) was covered.

[ the answer to these points will depend on your answer to my first question
in this email, so... I will reply to it later ].

-- Daniel

>>> Then:
>>>
>>> 1. Create RV models for states of interest not covered by normal code
>>> coverage of code under test.
>>>
>>> 2. Enable KCOV for everything.
>>>
>>> 3. KCOV's coverage of the RV model will tell us if we reached the
>>> desired "functional coverage" (and can be used by e.g. syzbot to
>>> generate better tests without any additional changes because it
>>> already talks to KCOV).
>>>
>>> Thoughts?
>>>
>>> Thanks,
>>> -- Marco
>

2021-06-22 05:18:38

by Dmitry Vyukov

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On Mon, Jun 21, 2021 at 12:30 PM Marco Elver <[email protected]> wrote:
>
> On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
> [...]
> > > Yes, unlike code/structural coverage (which is what we have today via
> > > KCOV) functional coverage checks if some interesting states were reached
> > > (e.g. was buffer full/empty, did we observe transition a->b etc.).
> >
> > So you want to observe a given a->b transition, not that B was visited?
>
> An a->b transition would imply that a and b were visited.
>
> > I still need to understand what you are aiming to verify, and what is the
> > approach that you would like to use to express the specifications of the systems...
> >
> > Can you give me a simple example?
>
> The older discussion started around how to get the fuzzer
> into more interesting states in complex concurrent algorithms. But
> otherwise I have no idea ... we were just brainstorming and got to the
> point where it looked like "functional coverage" would improve automated
> test generation in general. And then I found RV which pretty much can
> specify "functional coverage" and almost gets that information to KCOV
> "for free".
>
> > so, you want to have a different function for every transition so KCOV can
> > observe that?
>
> Not a different function, just distinct "basic blocks". KCOV uses
> compiler instrumentation, and a sequence of non-branching instructions
> denotes one point of coverage; at the next branch (conditional or otherwise)
> it then records which branch was taken and therefore we know which code
> paths were covered.
>
> > >
> > > From what I can tell this doesn't quite happen today, because
> > > automaton::function is a lookup table as an array.
> >
> > It is the transition function of the formal automaton definition. Check this:
> >
> > https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
> >
> > page 9.
> >
> > > Could this just
> > > become a generated function with a switch statement? Because then I
> > > think we'd pretty much have all the ingredients we need.
> >
> > a switch statement that would.... call a different function for each transition?
>
> No, just a switch statement that returns the same thing as it does
> today. But KCOV wouldn't see different coverage with the
> current version because it's all in one basic block because it looks up
> the next state given the current state out of the array. If it was a
> switch statement doing the same thing, the compiler will turn the thing
> into conditional branches and KCOV then knows which code path
> (effectively the transition) was covered.

If we do this, we need to watch out for compiler optimizations. In
both clang and gcc the KCOV pass runs in the middle of the middle-end,
after some optimizations. It's possible that some trivial branches have
already been merged back into unconditional code (e.g. lowered to jump
tables or conditional moves).
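[ An example of the kind of collapsing to watch out for (a sketch, with
state/event names as in the earlier sketch): ]

static char next_state(char state, char event)
{
        /* Written as distinct branches per transition... */
        switch (state) {
        case S0: return event == E_A ? S1 : S0;
        case S1: return event == E_B ? S0 : S1;
        }
        /* ...but the optimizer may lower the switch to a table load or
         * conditional moves, folding it back into a single basic block
         * and hence a single coverage point. */
        return state;
}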


> > > Then:
> > >
> > > 1. Create RV models for states of interest not covered by normal code
> > > coverage of code under test.
> > >
> > > 2. Enable KCOV for everything.
> > >
> > > 3. KCOV's coverage of the RV model will tell us if we reached the
> > > desired "functional coverage" (and can be used by e.g. syzbot to
> > > generate better tests without any additional changes because it
> > > already talks to KCOV).
> > >
> > > Thoughts?
> > >
> > > Thanks,
> > > -- Marco

2021-06-22 06:33:55

by Dmitry Vyukov

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On Mon, Jun 21, 2021 at 10:39 AM Daniel Bristot de Oliveira
<[email protected]> wrote:
>
> On 6/19/21 1:08 PM, Dmitry Vyukov wrote:
> > On Fri, Jun 18, 2021 at 1:26 PM Marco Elver <[email protected]> wrote:
> >>
> >> On Fri, Jun 18, 2021 at 09:58AM +0200, Daniel Bristot de Oliveira wrote:
> >>> On 6/17/21 1:20 PM, Marco Elver wrote:
> >>>> [+Daniel, just FYI. We had a discussion about "functional coverage"
> >>>> and fuzzing, and I've just seen your wonderful work on RV. If you have
> >>>> thought about fuzzing with RV and how coverage of the model impacts
> >>>> test generation, I'd be curious to hear.]
> >>>
> >>> One aspect of RV is that we verify the actual execution of the system instead of
> >>> a complete model of the system, so we depend on testing to cover all the
> >>> aspects of the system <-> model mapping.
> >>>
> >>> There is a natural relation between testing/fuzzing & friends and RV.
> >>>
> >>>> Looks like there is ongoing work on specifying models and running them
> >>>> along with the kernel: https://lwn.net/Articles/857862/
> >>>>
> >>>> Those models that are run alongside the kernel would have their own
> >>>> coverage, and since there's a mapping between real code and model, a
> >>>> fuzzer trying to reach new code in one or the other will ultimately
> >>>> improve coverage for both.
> >>>
> >>> Perfect!
> >>>
> >>>> Just wanted to document this here, because it seems quite relevant.
> >>>> I'm guessing that "functional coverage" would indeed be a side-effect
> >>>> of a good RV model?
> >>>
> >>> So, let me see if I understood the terms. Functional coverage is a way to check
> >>> if all the desired aspects of a code/system/subsystem/functionality were covered
> >>> by a set of tests?
> >>
> >> Yes, unlike code/structural coverage (which is what we have today via
> >> KCOV) functional coverage checks if some interesting states were reached
> >> (e.g. was buffer full/empty, did we observe transition a->b etc.).
> >>
> >> Functional coverage is common in hardware verification, but of course
> >> software verification would benefit just as much -- just haven't seen it
> >> used much in practice yet.
> >> [ Example for HW verification: https://www.chipverify.com/systemverilog/systemverilog-functional-coverage ]
> >>
> >> It still requires some creativity from the designer/developer to come up
> >> with suitable functional coverage. State explosion is a problem, too,
> >> and naturally it is impractical to capture all possible states ... after
> >> all, functional coverage is meant to direct the test generator/fuzzer
> >> into more interesting states -- we're not doing model checking after all.
> >>
> >>> If that is correct, we could use RV to:
> >>>
> >>> - create an explicit model of the states we want to cover.
> >>> - check if all the desired states were visited during testing.
> >>>
> >>> ?
> >>
> >> Yes, pretty much. On one hand there could be an interface to query if
> >> all states were covered, but I think this isn't useful out-of-the box.
> >> Instead, I was thinking we can simply get KCOV to help us out: my
> >> hypothesis is that most of this would happen automatically if dot2k's
> >> generated code has distinct code paths per transition.
> >>
> >> If KCOV covers the RV model (since it's executable kernel C code), then
> >> having distinct code paths for "state transitions" will effectively give
> >> us functional coverage indirectly through code coverage (via KCOV) of
> >> the RV model.
> >>
> >> From what I can tell this doesn't quite happen today, because
> >> automaton::function is a lookup table as an array. Could this just
> >> become a generated function with a switch statement? Because then I
> >> think we'd pretty much have all the ingredients we need.
> >>
> >> Then:
> >>
> >> 1. Create RV models for states of interest not covered by normal code
> >> coverage of code under test.
> >>
> >> 2. Enable KCOV for everything.
> >>
> >> 3. KCOV's coverage of the RV model will tell us if we reached the
> >> desired "functional coverage" (and can be used by e.g. syzbot to
> >> generate better tests without any additional changes because it
> >> already talks to KCOV).
> >>
> >> Thoughts?
> >
> > I think there is usually already some code for any important state
> > transitions. E.g. I can't imagine how a socket can transition to
> > active/listen/shutdown/closed states w/o any code.
>
> makes sense...
>
> > I see RV to be potentially more useful for the "coverage dimensions"
> > idea. I.e. for sockets that would be treating coverage for a socket
> > function X as different coverage based on the current socket state,
> > effectively consider (PC,state) as feedback signal.
>
> How can RV subsystem talk with KCOV?

KCOV collects a trace of covered PCs. One natural way for this
interface would be a callback that allows injecting RV state events
into the KCOV trace. To make it possible to associate states with
code, these events need to be scoped, e.g.:

void kcov_state_start(int model, int state);
void kcov_state_end(int model, int state);

There is no prior art that I am aware of, so I assume it will require
some experimentation and research work to figure out exactly what
interface works best, if it works at all, how much it helps fuzzing,
is it a good metric for assessing testing coverage, etc.
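[ A sketch of how an RV monitor might drive such hooks; the two functions
are the proposal above, the struct fields and helper are hypothetical: ]

static void monitor_handle_event(struct rv_monitor *mon, int event)
{
        int next = get_next_state(mon->state, event);

        /* Re-scope coverage: PCs recorded from here on are attributed to
         * "this model in state 'next'" in the KCOV trace. */
        kcov_state_end(mon->model_id, mon->state);
        kcov_state_start(mon->model_id, next);
        mon->state = next;
}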

> > But my concern is that we don't want to simply consider combinations
> > of all kernel code multiplied by all combinations of states of all RV
> > models.
>
> I agree! Also because RV monitors will generally monitor a specific part of the
> code (with exceptions for models like the preemption one).
>
> > Most likely this will lead to severe feedback signal
> > explosion. So the question is: how do we understand that the socket
> > model relates only to this restricted set of code?
> >
> Should we annotate a model, saying which subsystem it monitors/verifies?

Yes. The main question I see: how to specify what a "subsystem" is.

Besides dynamic scoping we could use a static mapping of models to code.
E.g. the socket model covers net/core/*.c and net/tcp/*.c. Then maybe we
don't need dynamic scopes (?), however then it becomes trickier for
models that are associated with objects. Namely, if we traced
different states for different objects, which object does the current
execution belong to? Does it belong to any of these at all?
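[ A sketch of what such a static mapping could look like; everything here
is hypothetical: ]

struct rv_model_scope {
        const char *model;              /* RV model name */
        const char *const *path_globs;  /* code its states apply to */
};

static const char *const socket_globs[] = {
        "net/core/*.c", "net/tcp/*.c", NULL,
};

static const struct rv_model_scope rv_scopes[] = {
        { .model = "socket", .path_globs = socket_globs },
};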

2021-06-22 10:50:44

by Marco Elver

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On Mon, Jun 21, 2021 at 09:25PM +0200, Daniel Bristot de Oliveira wrote:
> On 6/21/21 12:30 PM, Marco Elver wrote:
> > On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
> > [...]
> >>> Yes, unlike code/structural coverage (which is what we have today via
> >>> KCOV) functional coverage checks if some interesting states were reached
> >>> (e.g. was buffer full/empty, did we observe transition a->b etc.).
> >>
> >> So you want to observe a given a->b transition, not that B was visited?
> >
> > An a->b transition would imply that a and b were visited.
>
> HA! let's try again with a less abstract example...

Terminology misunderstanding.

I mean "state transition". Writing "a->b transition" led me to infer 'a'
and 'b' are states, but from below I infer that you meant an "event
trace" (viz. event sequence). So it seems I was wrong.

Let me be clearer: transition A -[a]-> B implies states A and B were
visited. Hence, knowing that event 'a' occurred is sufficient, and
actually provides a little more information than just "A and B were
visited".

>
> | +------------ on --+----------------+
> v ^ +--------v v
> +========+ | +===========+>--- suspend ---->+===========+
> | OFF | +- on --<| ON | | SUSPENDED |
> +========+ <------ shutdown -----<+===========+<----- on -------<+===========+
> ^ v v
> +--------------- off ----------------+-----------------------------+
>
> Do you care about:
>
> 1) states [OFF|ON|SUSPENDED] being visited a # of times; or
> 2) the occurrence of the [on|suspend|off] events a # of times; or
> 3) the language generated by the "state machine"; like:
>
> the occurrence of *"on -> suspend -> on -> off"*
>
> which is != of
>
> the occurrence of *"on -> on -> suspend -> off"*
>
> although the same events and states occurred the same # of times
> ?

They are all interesting, but unrealistic for a fuzzer to keep track of.
We can't realistically keep track of all possible event traces, nor whether
some state or event was visited a given # of times.

What I did mean is as described above: the simple occurrence of an
event, as it implies some previous and next state were visited.

The fuzzer then builds up knowledge of which inputs cause some events to
occur. Because it knows it has inputs for such events, it will then try
to further combine these inputs hoping to reach new coverage. This leads
to various distinct event traces using the events it has already
observed. All of this is somewhat random of course, because fuzzers are
not meant to be model checkers.

If someone wants something more complex as you describe, it'd have to
explicitly become part of the model (if possible?). The problem of
coverage explosion applies, and we may not recommend such usage anyway.

> RV can give you all... but the way to inform this might be different.
>
> >> I still need to understand what you are aiming to verify, and what is the
> >> approach that you would like to use to express the specifications of the systems...
> >>
> >> Can you give me a simple example?
> >
> > The older discussion started around how to get the fuzzer
> > into more interesting states in complex concurrent algorithms. But
> > otherwise I have no idea ... we were just brainstorming and got to the
> > point where it looked like "functional coverage" would improve automated
> > test generation in general. And then I found RV which pretty much can
> > specify "functional coverage" and almost gets that information to KCOV
> > "for free".
>
> I think we will end up having an almost for free solution, but worth the price.
>
> >> so, you want to have a different function for every transition so KCOV can
> >> observe that?
> >
> > Not a different function, just distinct "basic blocks". KCOV uses
> > compiler instrumentation, and a sequence of non-branching instructions
> > denotes one point of coverage; at the next branch (conditional or otherwise)
> > it then records which branch was taken and therefore we know which code
> > paths were covered.
>
> ah, got it. But can't KCOV be extended with another source of information?

Not without changing KCOV. And I think we're wary of something like
that due to the potential for coverage explosion. -fsanitize-coverage
has various options to capture different types of coverage actually, not
purely basic block based coverage. (KCOV already supports
KCOV_ENABLE_COMPARISONS, perhaps that could help somehow. It captures
arguments of comparisons.)
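[ For reference, a userspace sketch of enabling comparison tracing,
following Documentation/dev-tools/kcov.rst (requires the
KCOV_ENABLE_COMPARISONS config option): ]

#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
#define KCOV_ENABLE     _IO('c', 100)
#define KCOV_TRACE_CMP  1
#define COVER_SIZE      (64 << 10)

int main(void)
{
        int fd = open("/sys/kernel/debug/kcov", O_RDWR);
        unsigned long *cover;

        ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE);
        cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                     PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        ioctl(fd, KCOV_ENABLE, KCOV_TRACE_CMP);
        /* ... run the code under test; the buffer now holds one record per
         * executed comparison, including both operands and the PC ... */
        return 0;
}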

> >>>
> >>> From what I can tell this doesn't quite happen today, because
> >>> automaton::function is a lookup table as an array.
> >>
> >> It is the transition function of the formal automaton definition. Check this:
> >>
> >> https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
> >>
> >> page 9.
> >>
> >>> Could this just
> >>> become a generated function with a switch statement? Because then I
> >>> think we'd pretty much have all the ingredients we need.
> >>
> >> a switch statement that would.... call a different function for each transition?
> >
> > No, just a switch statement that returns the same thing as it does
> > today. But KCOV wouldn't see different coverage with the
> > current version because it's all in one basic block because it looks up
> > the next state given the current state out of the array. If it was a
> > switch statement doing the same thing, the compiler will turn the thing
> > into conditional branches and KCOV then knows which code path
> > (effectively the transition) was covered.

Per Dmitry's comment, yes we need to be careful that the compiler
doesn't collapse the switch statement somehow. But this should be
achievable with a bunch of 'barrier()'s after every 'case ...:'.
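[ E.g., a sketch with hypothetical per-state handlers: ]

switch (state) {
case S_OFF:
        barrier();      /* keep this case a distinct basic block */
        next = handle_off(event);
        break;
case S_ON:
        barrier();
        next = handle_on(event);
        break;
/* ... */
}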

> [ the answer for this points will depend on your answer from my first question
> on this email so... I will reply it later ].
>
> -- Daniel
>
> >>> Then:
> >>>
> >>> 1. Create RV models for states of interest not covered by normal code
> >>> coverage of code under test.
> >>>
> >>> 2. Enable KCOV for everything.
> >>>
> >>> 3. KCOV's coverage of the RV model will tell us if we reached the
> >>> desired "functional coverage" (and can be used by e.g. syzbot to
> >>> generate better tests without any additional changes because it
> >>> already talks to KCOV).
> >>>
> >>> Thoughts?
> >>>
> >>> Thanks,
> >>> -- Marco

Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On 6/22/21 12:48 PM, Marco Elver wrote:
> On Mon, Jun 21, 2021 at 09:25PM +0200, Daniel Bristot de Oliveira wrote:
>> On 6/21/21 12:30 PM, Marco Elver wrote:
>>> On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
>>> [...]
>>>>> Yes, unlike code/structural coverage (which is what we have today via
>>>>> KCOV) functional coverage checks if some interesting states were reached
>>>>> (e.g. was buffer full/empty, did we observe transition a->b etc.).
>>>>
>>>> So you want to observe a given a->b transition, not that B was visited?
>>>
>>> An a->b transition would imply that a and b were visited.
>>
>> HA! let's try again with a less abstract example...
>
> Terminology misunderstanding.
>
> I mean "state transition". Writing "a->b transition" led me to infer 'a'
> and 'b' are states, but from below I infer that you meant an "event
> trace" (viz. event sequence). So it seems I was wrong.
>
> Let me be clearer: transition A -[a]-> B implies states A and B were
> visited.

right

> Hence, knowing that event 'a' occurred is sufficient, and
> actually provides a little more information than just "A and B were
> visited".

iff [a] happens only from A to B...

>
>>
>> | +------------ on --+----------------+
>> v ^ +--------v v
>> +========+ | +===========+>--- suspend ---->+===========+
>> | OFF | +- on --<| ON | | SUSPENDED |
>> +========+ <------ shutdown -----<+===========+<----- on -------<+===========+
>> ^ v v
>> +--------------- off ----------------+-----------------------------+
>>
>> Do you care about:
>>
>> 1) states [OFF|ON|SUSPENDED] being visited a # of times; or
>> 2) the occurrence of the [on|suspend|off] events a # of times; or
>> 3) the language generated by the "state machine"; like:
>>
>> the occurrence of *"on -> suspend -> on -> off"*
>>
>> which is != of
>>
>> the occurrence of *"on -> on -> suspend -> off"*
>>
>> although the same events and states occurred the same # of times
>> ?
>
> They are all interesting, but unrealistic for a fuzzer to keep track of.
> We can't realistically keep track of all possible event traces, nor of
> how many times some state or event was visited.

We can track this easily via RV, and doing that is already on my todo list. But
now I see that we do not need all this information for functional coverage.

> What I did mean is as described above: the simple occurrence of an
> event, as it implies some previous and next state were visited.
>
> The fuzzer then builds up knowledge of which inputs cause some events to
> occur. Because it knows it has inputs for such events, it will then try
> to further combine these inputs hoping to reach new coverage. This leads
> to various distinct event traces using the events it has already
> observed. All of this is somewhat random of course, because fuzzers are
> not meant to be model checkers.
>
> If someone wants something more complex as you describe, it'd have to
> explicitly become part of the model (if possible?). The problem of
> coverage explosion applies, and we may not recommend such usage anyway.

I did not mean to make KCOV/the fuzzer keep track of this information. I
was trying to understand the best way to provide the information that
you all need.

>> RV can give you all of these... but the way to report them might be different.
>>
>>>> I still need to understand what you are aiming to verify, and what is the
>>>> approach that you would like to use to express the specifications of the systems...
>>>>
>>>> Can you give me a simple example?
>>>
>>> The older discussion started around how to get the fuzzer into more
>>> interesting states in complex concurrent algorithms. But otherwise I
>>> have no idea ... we were just brainstorming and got to the point where
>>> it looked like "functional coverage" would improve automated test
>>> generation in general. And then I found RV, which can pretty much
>>> specify "functional coverage" and almost gets that information to KCOV
>>> "for free".
>>
>> I think we will end up having an almost-for-free solution, but one worth the price.
>>
>>>> so, you want to have a different function for every transition so KCOV can
>>>> observe that?
>>>
>>> Not a different function, just distinct "basic blocks". KCOV uses
>>> compiler instrumentation, and a sequence of non-branching instructions
>>> denotes one point of coverage; at the next branch (conditional or
>>> otherwise) it records which branch was taken, and therefore we know
>>> which code paths were covered.
>>
>> ah, got it. But can't KCOV be extended with another source of information?
>
> Not without changing KCOV. And I think we're wary of something like
> that, due to the potential for coverage explosion. -fsanitize-coverage
> actually has various options to capture different types of coverage,
> not just basic-block-based coverage. (KCOV already supports
> KCOV_ENABLE_COMPARISONS, which could perhaps help somehow; it captures
> the arguments of comparisons.)
>
>>>>>
>>>>> From what I can tell this doesn't quite happen today, because
>>>>> automaton::function is a lookup table implemented as an array.
>>>>
>>>> It is the transition function of the formal automaton definition. Check this:
>>>>
>>>> https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
>>>>
>>>> page 9.
>>>>
>>>>> Could this just
>>>>> become a generated function with a switch statement? Because then I
>>>>> think we'd pretty much have all the ingredients we need.
>>>>
>>>> a switch statement that would.... call a different function for each transition?
>>>
>>> No, just a switch statement that returns the same thing as it does
>>> today. But KCOV wouldn't see distinct coverage with the current
>>> version, because looking up the next state from the array given the
>>> current state is all one basic block. If a switch statement did the
>>> same thing, the compiler would turn it into conditional branches, and
>>> KCOV would then know which code path (effectively which transition)
>>> was covered.
>
> Per Dmitry's comment, yes, we need to be careful that the compiler
> doesn't collapse the switch statement somehow. But this should be
> achievable with a bunch of 'barrier()'s after every 'case ...:'.

Changing the "function" will add some overhead for the runtime monitor
use case, for example for safety-critical systems that run with a
monitor enabled to detect a failure and react to it.

But! I can extend the idea of the reactor to receive the successful state
transitions, or create an "observer" abstraction to which we can attach a
generic function that implements the switch statement. This function can
be auto-generated by dot2k as well...

This reactor/observer can be enabled/disabled, so... we can add as many
annotations and barriers as we want.
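
Something along these lines, perhaps (a sketch with invented names;
rv_observe_transition() would be the dot2k-generated, KCOV-friendly
switch):

	#include <linux/jump_label.h>

	static DEFINE_STATIC_KEY_FALSE(rv_observer_enabled);

	/* Hypothetical generated hook containing the switch version. */
	void rv_observe_transition(int curr_state, int event, int next);

	static int get_next_state(int curr_state, int event)
	{
		int next = function[curr_state][event];	/* production path */

		/* Only when the observer is enabled (e.g. on a fuzzing
		 * kernel) do we run the KCOV-visible version. */
		if (static_branch_unlikely(&rv_observer_enabled))
			rv_observe_transition(curr_state, event, next);

		return next;
	}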

Thoughts?

-- Daniel

2021-06-24 08:13:20

by Marco Elver

[permalink] [raw]
Subject: Re: Functional Coverage via RV? (was: "Learning-based Controlled Concurrency Testing")

On Wed, 23 Jun 2021 at 11:10, Daniel Bristot de Oliveira
<[email protected]> wrote:
> On 6/22/21 12:48 PM, Marco Elver wrote:
> > On Mon, Jun 21, 2021 at 09:25PM +0200, Daniel Bristot de Oliveira wrote:
> >> On 6/21/21 12:30 PM, Marco Elver wrote:
> >>> On Mon, Jun 21, 2021 at 10:23AM +0200, Daniel Bristot de Oliveira wrote:
> >>> [...]
> >>>>> Yes, unlike code/structural coverage (which is what we have today via
> >>>>> KCOV) functional coverage checks if some interesting states were reached
> >>>>> (e.g. was buffer full/empty, did we observe transition a->b etc.).
> >>>>
> >>>> So you want to observe a given a->b transition, not that B was visited?
> >>>
> >>> An a->b transition would imply that a and b were visited.
> >>
> >> HA! let's try again with a less abstract example...
> >
> > Terminology misunderstanding.
> >
> > I mean "state transition". Writing "a->b transition" led me to infer 'a'
> > and 'b' are states, but from below I infer that you meant an "event
> > trace" (viz. event sequence). So it seems I was wrong.
> >
> > Let me be clearer: transition A -[a]-> B implies states A and B were
> > visited.
>
> right
>
> > Hence, knowing that event 'a' occurred is sufficient, and
> > actually provides a little more information than just "A and B were
> > visited".
>
> iff [a] happens only from A to B...
>
> >
> >>
> >> | +------------ on --+----------------+
> >> v ^ +--------v v
> >> +========+ | +===========+>--- suspend ---->+===========+
> >> | OFF | +- on --<| ON | | SUSPENDED |
> >> +========+ <------ shutdown -----<+===========+<----- on -------<+===========+
> >> ^ v v
> >> +--------------- off ----------------+-----------------------------+
> >>
> >> Do you care about:
> >>
> >> 1) states [OFF|ON|SUSPENDED] being visited a # of times; or
> >> 2) the occurrence of the [on|suspend|off] events a # of times; or
> >> 3) the language generated by the "state machine"; like:
> >>
> >> the occurrence of *"on -> suspend -> on -> off"*
> >>
> >> which is != of
> >>
> >> the occurrence of *"on -> on -> suspend -> off"*
> >>
> >> although the same events and states occurred the same # of times
> >> ?
> >
> > They are all interesting, but unrealistic for a fuzzer to keep track of.
> > We can't realistically keep track of all possible event traces, nor of
> > how many times some state or event was visited.
>
> We can track this easily via RV, and doing that is already on my todo list. But
> now I see that we do not need all this information for functional coverage.
>
> > What I did mean is as described above: the simple occurrence of an
> > event, as it implies some previous and next state were visited.
> >
> > The fuzzer then builds up knowledge of which inputs cause some events to
> > occur. Because it knows it has inputs for such events, it will then try
> > to further combine these inputs hoping to reach new coverage. This leads
> > to various distinct event traces using the events it has already
> > observed. All of this is somewhat random of course, because fuzzers are
> > not meant to be model checkers.
> >
> > If someone wants something more complex as you describe, it'd have to
> > explicitly become part of the model (if possible?). The problem of
> > coverage explosion applies, and we may not recommend such usage anyway.
>
> I did not mean to make KCOV/the fuzzer keep track of this information. I
> was trying to understand the best way to provide the information that
> you all need.
>
> >> RV can give you all of these... but the way to report them might be different.
> >>
> >>>> I still need to understand what you are aiming to verify, and what is the
> >>>> approach that you would like to use to express the specifications of the systems...
> >>>>
> >>>> Can you give me a simple example?
> >>>
> >>> The older discussion started around how to get the fuzzer into more
> >>> interesting states in complex concurrent algorithms. But otherwise I
> >>> have no idea ... we were just brainstorming and got to the point where
> >>> it looked like "functional coverage" would improve automated test
> >>> generation in general. And then I found RV, which can pretty much
> >>> specify "functional coverage" and almost gets that information to KCOV
> >>> "for free".
> >>
> >> I think we will end up having an almost-for-free solution, but one worth the price.
> >>
> >>>> so, you want to have a different function for every transition so KCOV can
> >>>> observe that?
> >>>
> >>> Not a different function, just distinct "basic blocks". KCOV uses
> >>> compiler instrumentation, and a sequence of non-branching instructions
> >>> denotes one point of coverage; at the next branch (conditional or
> >>> otherwise) it records which branch was taken, and therefore we know
> >>> which code paths were covered.
> >>
> >> ah, got it. But can't KCOV be extended with another source of information?
> >
> > Not without changing KCOV. And I think we're wary of something like
> > that, due to the potential for coverage explosion. -fsanitize-coverage
> > actually has various options to capture different types of coverage,
> > not just basic-block-based coverage. (KCOV already supports
> > KCOV_ENABLE_COMPARISONS, which could perhaps help somehow; it captures
> > the arguments of comparisons.)
> >
> >>>>>
> >>>>> From what I can tell this doesn't quite happen today, because
> >>>>> automaton::function is a lookup table implemented as an array.
> >>>>
> >>>> It is the transition function of the formal automaton definition. Check this:
> >>>>
> >>>> https://bristot.me/wp-content/uploads/2020/01/JSA_preprint.pdf
> >>>>
> >>>> page 9.
> >>>>
> >>>>> Could this just
> >>>>> become a generated function with a switch statement? Because then I
> >>>>> think we'd pretty much have all the ingredients we need.
> >>>>
> >>>> a switch statement that would.... call a different function for each transition?
> >>>
> >>> No, just a switch statement that returns the same thing as it does
> >>> today. But KCOV wouldn't see distinct coverage with the current
> >>> version, because looking up the next state from the array given the
> >>> current state is all one basic block. If a switch statement did the
> >>> same thing, the compiler would turn it into conditional branches, and
> >>> KCOV would then know which code path (effectively which transition)
> >>> was covered.
> >
> > Per Dmitry's comment, yes, we need to be careful that the compiler
> > doesn't collapse the switch statement somehow. But this should be
> > achievable with a bunch of 'barrier()'s after every 'case ...:'.
>
> Changing the "function" will add some overhead for the runtime monitor
> use case, for example for safety-critical systems that run with a
> monitor enabled to detect a failure and react to it.
>
> But! I can extend the idea of the reactor to receive the successful state
> transitions, or create an "observer" abstraction to which we can attach a
> generic function that implements the switch statement. This function can
> be auto-generated by dot2k as well...
>
> This reactor/observer can be enabled/disabled, so... we can add as many
> annotations and barriers as we want.
>
> Thoughts?

That sounds reasonable. Simply having an option (Kconfig would be
ideal) to enable the KCOV-friendly version of the transition function
is good enough for the fuzzer use case. The kernels built for fuzzing
usually include lots of other debug options anyway, and aren't
production kernels.
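
Something like this hypothetical Kconfig entry would do (option name and
dependencies invented for illustration):

	config RV_KCOV_FRIENDLY
		bool "Generate KCOV-friendly RV transition functions"
		depends on RV && KCOV
		help
		  Emit each automaton's transition function as a switch
		  statement, so KCOV can attribute coverage to individual
		  model transitions. Adds overhead; intended for
		  fuzzing/debug kernels rather than production use.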

Thanks,
-- Marco