Neil Brown wrote:
>
>If I'm reading this correctly, this bug is introduced by your previous
>patch.
>
>
Depending on how you see the issue. From my end, I view this as the
existing code has a "trap" and I fell into it. This is probably a chance
to clean up this logic.
>The important difference between the old code and the new code here is
>that the old code tests "file->f_locks" while the new code iterates
>through i_flock to see if there are any lockd locks.
>
>f_locks is set to a count of lockd locks in nlm_traverse_locks which
>*was* always called by nlm_inspect_file which is called immediately
>before the code you are changing.
>But since your patch, nlm_inspect_file does not always call
>nlm_traverse_locks, so there is a chance that f_locks is wrong.
>
>With this patch, f_locks it not used at all any more.
>
>
Yes, a fair description of the issue !
>Introducing a bug in one patch and fixing in the next is bad style.
>
>
ok .....
>Some options:
>
> Have an initial patch which removes all references to f_locks and
> includes the change in this patch. With that in place your main
> patch won't introduce a bug. If you do this, you should attempt to
> understand and justify the performance impact (does nlm_traverse_files
> become quadratic in number of locks. Is that acceptable?).
>
> Change the first patch to explicitly update f_count if you bypass the
> call to nlm_inspect_file.
>
> something else???
>
>
Let's see what hch says in another email... will come back to this soon.
>So NAK for this one in it's current form... unless I've misunderstood
>something.
>
>
>
I expect this :)
-- Wendy
Wendy Cheng wrote:
> Neil Brown wrote:
>
>> Some options:
>>
>> Have an initial patch which removes all references to f_locks and
>> includes the change in this patch. With that in place your main
>> patch won't introduce a bug. If you do this, you should attempt to
>> understand and justify the performance impact (does nlm_traverse_files
>> become quadratic in number of locks. Is that acceptable?).
>>
>> Change the first patch to explicitly update f_count if you bypass the
>> call to nlm_inspect_file.
>>
>> something else???
>>
>>
>
> Let's see what hch says in another email... will come back to this soon.
>
Will do a quick and dirty performance measure tomorrow when I get to the
office. Then we'll go from there.
-- Wendy
Wendy Cheng wrote:
> Wendy Cheng wrote:
>
>> Neil Brown wrote:
>>
>>> Some options:
>>>
>>> Have an initial patch which removes all references to f_locks and
>>> includes the change in this patch. With that in place your main
>>> patch won't introduce a bug. If you do this, you should attempt to
>>> understand and justify the performance impact (does nlm_traverse_files
>>> become quadratic in number of locks. Is that acceptable?).
>>>
>>> Change the first patch to explicitly update f_count if you bypass the
>>> call to nlm_inspect_file.
>>>
>>> something else???
>>>
>>>
>>
>> Let's see what hch says in another email... will come back to this soon.
>>
>
> Will do a quick and dirty performance measure tomorrow when I get to
> the office. Then we'll go from there.
The test didn't go well and I'm still debugging .. However, let's
revisit the issue:
[quot from Neil's comment]
The important difference between the old code and the new code here is
that the old code tests "file->f_locks" while the new code iterates
through i_flock to see if there are any lockd locks.
f_locks is set to a count of lockd locks in nlm_traverse_locks which
*was* always called by nlm_inspect_file which is called immediately
before the code you are changing.
But since your patch, nlm_inspect_file does not always call
nlm_traverse_locks, so there is a chance that f_locks is wrong.
With this patch, f_locks it not used at all any more.
[end quot]
The point here is "with this patch, f_locks it not used at all any
more." Note that we have a nice inline function "nlm_file_inuse", why
should we use f_locks (that I assume people agree that it is awkward) ?
Could we simply drop f_locks all together in this section of code?
-- Wendy
Wendy Cheng wrote:
> The point here is "with this patch, f_locks it not used at all any
> more." Note that we have a nice inline function "nlm_file_inuse", why
> should we use f_locks (that I assume people agree that it is awkward)
> ? Could we simply drop f_locks all together in this section of code?
>
Start to have a second thought about this ....
Removing f_locks does make the code much more readable. However, if
inode->i_flock list is long, e.g. large amount of (clients) hosts and/or
processes from other hosts competing for the same lock, we don't want to
do the list walk twice within nlm_traverse_files(). Intuitively this is
unlikely but I prefer not changing the current behavior. As the result,
I'm withdrawing the patch all together.
The first patch will handle the issue correctly. Will submit it after
this post.
-- Wendy