2022-08-08 10:29:09

by Ondrej Mosnacek

Subject: Binder regression caused by commit a43cfc87caaf

Hello,

FYI, since commit a43cfc87caaf ("android: binder: stop saving a
pointer to the VMA") (found by git bisect) the binder test in
selinux-testsuite [1] started to trigger a lockdep assert BUG() in
find_vma() - see the end of [2] for an example.

A minimal reproducer is:
```
git clone https://github.com/SELinuxProject/selinux-testsuite.git
cd selinux-testsuite/tests/binder
make
setenforce 0 # if SELinux is enabled
./init_binder.sh || true
./manager -n -v & sleep 2
./service_provider -n -v
```
Requires the equivalent of libselinux-devel, make, gcc, and git-core
Fedora packages.
The last command will trigger the BUG; on good kernels it will
successfully enter the ioctl loop.

[1] https://github.com/SELinuxProject/selinux-testsuite/
[2] https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2022/08/07/redhat:606549366/build_x86_64_redhat:606549366_x86_64/tests/5/results_0001/console.log/console.log

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.


2022-08-08 15:16:09

by Liam R. Howlett

Subject: Re: Binder regression caused by commit a43cfc87caaf

* Ondrej Mosnacek <[email protected]> [220808 06:13]:
> Hello,
>
> FYI, since commit a43cfc87caaf ("android: binder: stop saving a
> pointer to the VMA") (found by git bisect) the binder test in
> selinux-testsuite [1] started to trigger a lockdep assert BUG() in
> find_vma() - see the end of [2] for an example.
>
> A minimal reproducer is:
> ```
> git clone https://github.com/SELinuxProject/selinux-testsuite.git
> cd selinux-testsuite/tests/binder
> make
> setenforce 0 # if SELinux is enabled
> ./init_binder.sh || true
> ./manager -n -v & sleep 2
> ./service_provider -n -v
> ```
> Requires the equivalent of libselinux-devel, make, gcc, and git-core
> Fedora packages.
> The last command will trigger the BUG; on good kernels it will
> successfully enter the ioctl loop.
>
> [1] https://github.com/SELinuxProject/selinux-testsuite/
> [2] https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2022/08/07/redhat:606549366/build_x86_64_redhat:606549366_x86_64/tests/5/results_0001/console.log/console.log
>

Thanks. It looks like binder has some paths that are not taking the
necessary mmap lock for using VMAs. I'm looking into it now.

Regards,
Liam

2022-08-08 20:00:59

by Liam R. Howlett

Subject: Re: Binder regression caused by commit a43cfc87caaf

* Liam R. Howlett <[email protected]> [220808 11:07]:
> * Ondrej Mosnacek <[email protected]> [220808 06:13]:
> > Hello,
> >
> > FYI, since commit a43cfc87caaf ("android: binder: stop saving a
> > pointer to the VMA") (found by git bisect) the binder test in
> > selinux-testsuite [1] started to trigger a lockdep assert BUG() in
> > find_vma() - see the end of [2] for an example.
> >
> > A minimal reproducer is:
> > ```
> > git clone https://github.com/SELinuxProject/selinux-testsuite.git
> > cd selinux-testsuite/tests/binder
> > make
> > setenforce 0 # if SELinux is enabled
> > ./init_binder.sh || true
> > ./manager -n -v & sleep 2
> > ./service_provider -n -v
> > ```
> > Requires the equivalent of libselinux-devel, make, gcc, and git-core
> > Fedora packages.
> > The last command will trigger the BUG; on good kernels it will
> > successfully enter the ioctl loop.
> >
> > [1] https://github.com/SELinuxProject/selinux-testsuite/
> > [2] https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2022/08/07/redhat:606549366/build_x86_64_redhat:606549366_x86_64/tests/5/results_0001/console.log/console.log
> >
>
> Thanks. It looks like binder has some paths that are not taking the
> necessary mmap lock for using VMAs. I'm looking into it now.

This does not fail for me. Are you sure this is the reproducer? I see
the manager and service_provider communicate.

Looking at your trace and the code, the bug makes sense and I have
something that will probably fix the issue, but I'd like to verify.

Thanks,
Liam

2022-08-08 20:50:18

by Ondrej Mosnacek

Subject: Re: Binder regression caused by commit a43cfc87caaf

On Mon, Aug 8, 2022 at 9:52 PM Liam Howlett <[email protected]> wrote:
> * Liam R. Howlett <[email protected]> [220808 11:07]:
> > * Ondrej Mosnacek <[email protected]> [220808 06:13]:
> > > Hello,
> > >
> > > FYI, since commit a43cfc87caaf ("android: binder: stop saving a
> > > pointer to the VMA") (found by git bisect) the binder test in
> > > selinux-testsuite [1] started to trigger a lockdep assert BUG() in
> > > find_vma() - see the end of [2] for an example.
> > >
> > > A minimal reproducer is:
> > > ```
> > > git clone https://github.com/SELinuxProject/selinux-testsuite.git
> > > cd selinux-testsuite/tests/binder
> > > make
> > > setenforce 0 # if SELinux is enabled
> > > ./init_binder.sh || true
> > > ./manager -n -v & sleep 2
> > > ./service_provider -n -v
> > > ```
> > > Requires the equivalent of libselinux-devel, make, gcc, and git-core
> > > Fedora packages.
> > > The last command will trigger the BUG; on good kernels it will
> > > successfully enter the ioctl loop.
> > >
> > > [1] https://github.com/SELinuxProject/selinux-testsuite/
> > > [2] https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2022/08/07/redhat:606549366/build_x86_64_redhat:606549366_x86_64/tests/5/results_0001/console.log/console.log
> > >
> >
> > Thanks. It looks like binder has some paths that are not taking the
> > necessary mmap lock for using VMAs. I'm looking into it now.
>
> This does not fail for me. Are you sure this is the reproducer? I see
> the manager and service_provider communicate.
>
> Looking at your trace and the code, the bug makes sense and I have
> something that will probably fix the issue, but I'd like to verify.

Hm... it seems it is necessary to have CONFIG_DEBUG_VM=y for the
particular debug check to be active. I guess you don't have that
turned on? It happens to be turned on in Fedora's release kernel, so I
didn't realize there was a specific config dependency.
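
A quick way to check for that option before running the reproducer might
look like the sketch below. It assumes the kernel config is available as
/boot/config-$(uname -r) (as on Fedora) or as /proc/config.gz (which
needs CONFIG_IKCONFIG_PROC); the helper name kconfig_enabled is made up
for illustration and is not part of the testsuite.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel config option is set to "y".
# Usage: kconfig_enabled OPTION [CONFIG_FILE]
kconfig_enabled() {
    opt=$1
    # Assumption: distro-style config file location; override with $2.
    cfg=${2:-/boot/config-$(uname -r)}
    if [ -r "$cfg" ]; then
        grep -q "^${opt}=y\$" "$cfg"
    elif [ -r /proc/config.gz ]; then
        # Fallback that only exists with CONFIG_IKCONFIG_PROC.
        zcat /proc/config.gz | grep -q "^${opt}=y\$"
    else
        echo "no kernel config found" >&2
        return 2
    fi
}

if kconfig_enabled CONFIG_DEBUG_VM; then
    echo "CONFIG_DEBUG_VM=y: the debug check should be active"
else
    echo "CONFIG_DEBUG_VM not confirmed: the reproducer may not trigger the BUG"
fi
```

If the second message is printed, the missing BUG on a kernel containing
a43cfc87caaf does not by itself mean the locking problem is absent.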

Thank you for looking into it!

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.