2021-06-04 19:14:51

by Jay Foster

Subject: Bluez Socket File Descriptor Leak

I am experiencing an odd problem with PAN networking.  I have PAN
networking set up in the NAP role.  Another machine makes a BNEP connection,
performs network activity, then disconnects the BNEP connection.
This repeats periodically.  This works just fine, except that after a while
the BNEP connection fails with the following in the log.

May  4 13:08:02 (none) daemon.debug bluetoothd[1373]:
profiles/network/server.c:confirm_event() BNEP: incoming connect from
B8:27:EB:E5:35:9B
May  4 13:08:03 (none) daemon.err bluetoothd[1373]: Can't add bnep0 to
the bridge br1: Too many open files(24)

ls /proc/`pidof bluetoothd`/fd shows about 1000 open file descriptors
(sockets mostly).  This looks like some kind of resource (file
descriptor) leak.
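
The count above can be broken down by descriptor type. A minimal Python sketch, assuming a Linux /proc filesystem and enough privilege to read the target process's fd directory; the helper name `count_fd_types` is made up for illustration and is not part of BlueZ:

```python
import os
from collections import Counter


def count_fd_types(fd_dir):
    """Tally the entries of a /proc/<pid>/fd directory by target type.

    Each entry is a symlink whose target looks like "socket:[12345]",
    "pipe:[678]", "anon_inode:[eventfd]" or a plain filesystem path.
    """
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("/"):
            kind = "file"
        else:
            # "socket:[12345]" -> "socket"
            kind = target.split(":", 1)[0]
        counts[kind] += 1
    return counts
```

Calling `count_fd_types("/proc/1373/fd")` (1373 being the bluetoothd pid from the log) at intervals would show whether the "socket" count is the one that keeps growing.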

Has anyone experienced this before?  I don't know if it is in the
bluetoothd application or one of the libraries (glib2, dbus) it links
with.  Happens with bluez 5.19 and 5.52.

Jay


2021-06-04 21:40:50

by Luiz Augusto von Dentz

Subject: Re: Bluez Socket File Descriptor Leak

Hi Jay,

On Fri, Jun 4, 2021 at 12:14 PM Jay Foster wrote:
>
> I am experiencing an odd problem with PAN networking. I have PAN
> networking set up in the NAP role. Another machine makes a BNEP connection
> and performs network activity, then disconnects the BNEP connection.
> This repeats periodically. This works just fine, except after a while,
> the BNEP connection fails with the following in the log.
>
> May 4 13:08:02 (none) daemon.debug bluetoothd[1373]:
> profiles/network/server.c:confirm_event() BNEP: incoming connect from
> B8:27:EB:E5:35:9B
> May 4 13:08:03 (none) daemon.err bluetoothd[1373]: Can't add bnep0 to
> the bridge br1: Too many open files(24)
>
> ls /proc/`pidof bluetoothd`/fd shows about 1000 open file descriptors
> (sockets mostly). This looks like some kind of resource (file
> descriptor) leak.
>
> Has anyone experienced this before? I don't know if it is in the
> bluetoothd application or one of the libraries (glib2, dbus) it links
> with. Happens with bluez 5.19 and 5.52.

That looks like the fds are not released (via close) after they are
attached to the bridge. You could in theory increase the number of fds
a process can have in the meantime, but we will need to fix this
problem at some point, so please create an issue on GitHub:

https://github.com/bluez/bluez/
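
For reference, the per-process limit in question is RLIMIT_NOFILE, and error 24 in the log is EMFILE on Linux. A rough Python sketch of why raising the limit only postpones the failure, assuming a steady leak of a fixed number of fds per connection; `cycles_until_emfile` is an illustrative helper, not a BlueZ function, and the numbers are examples only:

```python
import errno
import resource


def cycles_until_emfile(soft_limit, open_now, leaked_per_cycle=1):
    """Connect/disconnect cycles left before the process hits its fd limit.

    Assumes every cycle leaks `leaked_per_cycle` descriptors and nothing
    else opens or closes descriptors in the meantime.
    """
    if leaked_per_cycle <= 0:
        raise ValueError("leaked_per_cycle must be positive")
    return max(0, (soft_limit - open_now) // leaked_per_cycle)


# Soft and hard limits of the current process (not of bluetoothd).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("RLIMIT_NOFILE soft/hard:", soft, hard)
print("EMFILE errno value:", errno.EMFILE)

# Illustrative numbers only: a 1024 soft limit with 24 descriptors in use.
print(cycles_until_emfile(1024, 24))  # 1000
```

Whatever the limit is raised to, the number of remaining cycles stays finite as long as the leak exists.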

--
Luiz Augusto von Dentz

2021-06-04 22:19:35

by Jay Foster

Subject: Re: Bluez Socket File Descriptor Leak


On 6/4/2021 2:39 PM, Luiz Augusto von Dentz wrote:
> Hi Jay,
>
> That looks like the fds are not released (via close) after they are
> attached to the bridge. You could in theory increase the number of fds
> a process can have in the meantime, but we will need to fix this
> problem at some point, so please create an issue on GitHub:
>
> https://github.com/bluez/bluez/
>
Using strace attached to bluetoothd during a BNEP disconnect/reconnect
sequence, it looks like the socket that the previous BNEP connection was
accepted on is never closed.  bluetoothd accepts the new connection on a
new socket (the fd count goes up by one) but does not close the socket
from the previous connection.  This is unrelated to the bridge.  Adding
bnep0 to the bridge just happens to be the first operation that tries to
create a socket after the fd limit is reached.

Increasing the fd limit for the process is not an option (it would only
delay the failure).  This is a resource-limited embedded system.
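
The pattern described above can be reproduced in miniature. This is a Python sketch, not bluetoothd code: it uses socket.socketpair() to stand in for an accepted BNEP connection, and `run_cycles` is an illustrative name. It counts how many server-side sockets are still open after every peer has disconnected:

```python
import socket


def run_cycles(cycles, close_on_disconnect):
    """Simulate connect/disconnect cycles and return the number of
    server-side sockets still open once every peer has gone away."""
    accepted = []
    for _ in range(cycles):
        # socketpair() stands in for accept(): one end is "ours",
        # the other is the remote device.
        server_side, peer = socket.socketpair()
        peer.sendall(b"data")
        peer.close()  # remote side disconnects

        # Drain until recv() returns b"", which signals that the peer
        # hung up. Our end still holds its descriptor at this point.
        while server_side.recv(4096):
            pass

        if close_on_disconnect:
            server_side.close()
        accepted.append(server_side)

    # fileno() returns -1 for a socket object that has been closed.
    still_open = sum(1 for s in accepted if s.fileno() != -1)
    for s in accepted:
        s.close()  # clean up so the demo itself does not leak
    return still_open


print(run_cycles(5, close_on_disconnect=False))  # 5: one leaked fd per cycle
print(run_cycles(5, close_on_disconnect=True))   # 0
```

Seeing the peer disconnect does not release the local descriptor; only an explicit close does, which matches the one-fd-per-reconnect growth seen in strace.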