On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
> The dmabuf file uses get_next_ino()(through dma_buf_getfile() ->
> alloc_anon_inode()) to get an inode number and uses the same as a
> directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is
> used to collect the dmabuf stats and it is created through
> dma_buf_stats_setup(). At current, failure to create this directory
> entry can make the dma_buf_export() to fail.
>
> Now, as the get_next_ino() can definitely give a repetitive inode no
> causing the directory entry creation to fail with -EEXIST. This is a
> problem on the systems where dmabuf stats functionality is enabled on
> the production builds can make the dma_buf_export(), though the dmabuf
> memory is allocated successfully, to fail just because it couldn't
> create stats entry.
Then maybe we should not fail the creation path of the kobject fails to
be created? It's just for debugging, it should be fine if the creation
of it isn't there.
>
> This issue we are able to see on the snapdragon system within 13 days
> where there already exists a directory with inode no "122602" so
> dma_buf_stats_setup() failed with -EEXIST as it is trying to create
> the same directory entry.
>
> To make the directory entry as unique, append the inode creation time to
> the inode. With this change the stats directory entries will be in the
> format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
> secs>.
As you are changing the format here, shouldn't the Documentation/ABI/
entry for this also be changed?
And what's to keep the seconds field from also being the same?
thanks,
greg k-h
Thanks Greg for the inputs!!
On 5/10/2022 4:30 PM, Greg KH wrote:
>> The dmabuf file uses get_next_ino()(through dma_buf_getfile() ->
>> alloc_anon_inode()) to get an inode number and uses the same as a
>> directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is
>> used to collect the dmabuf stats and it is created through
>> dma_buf_stats_setup(). At current, failure to create this directory
>> entry can make the dma_buf_export() to fail.
>>
>> Now, as the get_next_ino() can definitely give a repetitive inode no
>> causing the directory entry creation to fail with -EEXIST. This is a
>> problem on the systems where dmabuf stats functionality is enabled on
>> the production builds can make the dma_buf_export(), though the dmabuf
>> memory is allocated successfully, to fail just because it couldn't
>> create stats entry.
> Then maybe we should not fail the creation path of the kobject fails to
> be created? It's just for debugging, it should be fine if the creation
> of it isn't there.
Not creating the debug node under some special cases can make this
interface not reliable if one wants to know info about the created
dmabuf buffers. Please help in correcting me If my perspective is wrong
here.
IIUC, except this -EEXIST condition, under the other conditions (-EINVAL
and -ENOMEM) failure is fine. Since, we are going to fix the -EEXIST
error in this patch, my opinion is failure in the kobject creation path
is acceptable for the reasons: a) The user is expected to pass the valid
dmabuf to create the stats node, b) The user can undefine the
CONFIG_DMABUF_SYSFS_STATS if he don't want this stats.
>
>> This issue we are able to see on the snapdragon system within 13 days
>> where there already exists a directory with inode no "122602" so
>> dma_buf_stats_setup() failed with -EEXIST as it is trying to create
>> the same directory entry.
>>
>> To make the directory entry as unique, append the inode creation time to
>> the inode. With this change the stats directory entries will be in the
>> format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
>> secs>.
> As you are changing the format here, shouldn't the Documentation/ABI/
> entry for this also be changed?
>
> And what's to keep the seconds field from also being the same?
get_next_ino() just increases the inode number monotonically and return
to the caller and it is 'unsigned int' data type. Thus 2 successive
calls always generate the different inode_number but can be the same
secs value. With inode-secs format, this will be still be a unique
string. Say it will be like ino1-sec1 and ino2-sec1.
Now after the inode number overflow and wraps, we may get the ino1 again
from the get_next_ino() but then secs will be different i.e. say it may
be like ino1-secn and ion2-secn. So, it always be a unique string.
IOW, with secs field added, to get the same inode-secs string, the uint
should overflow in the same second which is impossible.
Thanks for pointing out the changes to be done in ABI document. Will do
it in the next spin.
Am 10.05.22 um 13:00 schrieb Greg KH:
> On Tue, May 10, 2022 at 03:53:32PM +0530, Charan Teja Kalla wrote:
>> The dmabuf file uses get_next_ino()(through dma_buf_getfile() ->
>> alloc_anon_inode()) to get an inode number and uses the same as a
>> directory name under /sys/kernel/dmabuf/buffers/<ino>. This directory is
>> used to collect the dmabuf stats and it is created through
>> dma_buf_stats_setup(). At current, failure to create this directory
>> entry can make the dma_buf_export() to fail.
>>
>> Now, as the get_next_ino() can definitely give a repetitive inode no
>> causing the directory entry creation to fail with -EEXIST. This is a
>> problem on the systems where dmabuf stats functionality is enabled on
>> the production builds can make the dma_buf_export(), though the dmabuf
>> memory is allocated successfully, to fail just because it couldn't
>> create stats entry.
> Then maybe we should not fail the creation path of the kobject fails to
> be created? It's just for debugging, it should be fine if the creation
> of it isn't there.
Well if it's just for debugging then it should be under debugfs and not
sysfs.
>> This issue we are able to see on the snapdragon system within 13 days
>> where there already exists a directory with inode no "122602" so
>> dma_buf_stats_setup() failed with -EEXIST as it is trying to create
>> the same directory entry.
>>
>> To make the directory entry as unique, append the inode creation time to
>> the inode. With this change the stats directory entries will be in the
>> format of: /sys/kernel/dmabuf/buffers/<inode no>-<inode creation time in
>> secs>.
> As you are changing the format here, shouldn't the Documentation/ABI/
> entry for this also be changed?
As far as I can see that is even an UAPI break, not sure if we can allow
that.
> And what's to keep the seconds field from also being the same?
Well exporting two DMA-bufs with the same ino in the same nanosecond
should be basically impossible, but I would rather opt for using a 64bit
atomic in that function.
This should be 100% UAPI compatible and even if we manage to create one
buffer ever ns we need ~500years to wrap around.
Regards,
Christian.
>
> thanks,
>
> greg k-h
Thanks Christian for the inputs!!
On 5/10/2022 5:05 PM, Christian König wrote:
>
>> And what's to keep the seconds field from also being the same?
>
> Well exporting two DMA-bufs with the same ino in the same nanosecond
> should be basically impossible, but I would rather opt for using a 64bit
> atomic in that function.
>
> This should be 100% UAPI compatible and even if we manage to create one
> buffer ever ns we need ~500years to wrap around.
I see that the inode->i_ctime->tv_sec is already defined as
64bit(time64_t tv_sec), hence used it. This way we don't need extra
static atomic_t variable just to get a unique name.
Just pasting excerpt from the reply posted to Greg about why this secs
will always be a unique: with secs field added, to get the same
inode-secs string, the uint should overflow in the same second which is
impossible.
Let me know If you still opt for atomic variable only.