Hi,
I've been staring at this commit for some time, and I wonder what the
symptoms were when the issue was reproduced.
"The bug was discovered by manual code analysis and reproducible
only with explicit udelay() in lookup_elem_raw()."
I tried various combinations of stress tests and timing changes in
lookup_elem_raw(), but had no luck.
I believe that one of our production boxes recently ran into that issue, with a
general protection fault (GPF) in the area of htab_map_lookup_elem(). The crash
was seen on an outdated 4.9 stable kernel.
Please CC me as I'm not on the list.
Thanks in advance,
Etienne
On Mon, Aug 30, 2021 at 7:17 AM Etienne Martineau <[email protected]> wrote:
>
> Hi,
>
> I've been staring at this commit for some time and I wonder what were the
> symptoms when the issue was reproduced?
> "The bug was discovered by manual code analysis and reproducible
> only with explicit udelay() in lookup_elem_raw()."
>
> I tried various stress test + timing combinations in lookup_elem_raw() but no
> luck.
That fix was a long time ago :)
AFAIR the issue does not look like a crash; rather, an element
will not be found.
That's what lookup_nulls_elem_raw() is fixing.
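To illustrate the point above, here is a minimal userspace sketch of the idea behind a nulls-terminated chain lookup. It is not the kernel code: struct elem, walk_from() and demo_diverted_walk() are hypothetical names made up for this example, and the race is replayed single-threaded instead of under RCU. The assumption is that each chain ends in a marker encoding its bucket index, so a reader can tell when it finished in a different bucket than the one it started in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_BUCKETS 4u  /* power of two, so (hash & (N_BUCKETS - 1)) picks a bucket */

/* Hypothetical stand-in for a hash table element; not the kernel's layout. */
struct elem {
    uint32_t hash;
    int key;
    uintptr_t next;  /* next struct elem *, or a nulls marker (low bit set) */
};

/* A nulls marker ends a chain and records which bucket the chain belongs to. */
static uintptr_t make_nulls(uint32_t bucket) { return ((uintptr_t)bucket << 1) | 1u; }
static int is_nulls(uintptr_t p) { return (int)(p & 1u); }
static uint32_t nulls_value(uintptr_t p) { return (uint32_t)(p >> 1); }

enum result { FOUND, NOT_FOUND, MUST_RETRY };

/*
 * Continue a walk from `pos`. If the chain ends on a marker for a different
 * bucket than the one `hash` maps to, the walk was diverted (an element was
 * reused and relinked elsewhere), so a miss cannot be trusted.
 */
static enum result walk_from(uintptr_t pos, uint32_t hash, int key)
{
    while (!is_nulls(pos)) {
        const struct elem *e = (const struct elem *)pos;
        if (e->hash == hash && e->key == key)
            return FOUND;
        pos = e->next;
    }
    return nulls_value(pos) == (hash & (N_BUCKETS - 1)) ? NOT_FOUND : MUST_RETRY;
}

/*
 * Replays the race single-threaded: the reader is parked on `a` when `a` is
 * recycled into bucket 2, so following a.next leaves bucket 1.
 */
static enum result demo_diverted_walk(void)
{
    struct elem b = { .hash = 5, .key = 50, .next = make_nulls(1) };
    struct elem a = { .hash = 1, .key = 10, .next = (uintptr_t)&b };
    struct elem c = { .hash = 2, .key = 20, .next = make_nulls(2) };

    /* Sanity check: before the move, key 50 is reachable through `a`. */
    assert(walk_from((uintptr_t)&a, 5, 50) == FOUND);

    /* `a` is recycled into bucket 2: new hash/key, now linked ahead of `c`. */
    a.hash = 6;
    a.key = 60;
    a.next = (uintptr_t)&c;

    /* The parked reader resumes from `a`, still looking for key 50 (bucket 1). */
    return walk_from((uintptr_t)&a, 5, 50);
}
```

With a plain NULL-terminated chain, the diverted walk would simply report a miss even though key 50 is still in bucket 1, which matches the "element will not be found" symptom. The marker check lets the reader notice it ended in the wrong bucket and start over.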
> I believe that one of our production boxes ran into that issue lately with a GPF
> in the area of htab_map_lookup_elem(). The crash was seen on an outdated
> 4.9 stable.
Would be great if you can reproduce it on the latest kernel.
On Mon, Aug 30, 2021 at 12:39 PM Alexei Starovoitov
<[email protected]> wrote:
>
> On Mon, Aug 30, 2021 at 7:17 AM Etienne Martineau <[email protected]> wrote:
> >
> > Hi,
> >
> > I've been staring at this commit for some time and I wonder what were the
> > symptoms when the issue was reproduced?
> > "The bug was discovered by manual code analysis and reproducible
> > only with explicit udelay() in lookup_elem_raw()."
> >
> > I tried various stress test + timing combinations in lookup_elem_raw() but no
> > luck.
>
> That fix was a long time ago :)
> afair the issue will not look like a crash, but rather an element
> will not be found.
> That's what lookup_nulls_elem_raw() is fixing.
Under that same scenario, I wonder whether it's also possible to end up
with a corrupted element somehow?
>
> > I believe that one of our production boxes ran into that issue lately with a GPF
> > in the area of htab_map_lookup_elem(). The crash was seen on an outdated
> > 4.9 stable.
>
> Would be great if you can reproduce it on the latest kernel.
We have another deployment on 5.4 stable running the same BPF code, so
I will let you know.