2018-05-29 14:30:34

by James Simmons

Subject: [PATCH v2 00/25] staging: lustre: libcfs: SMP rework

From: James Simmons <[email protected]>

Recently Lustre support has been expanded to extreme machines with as
many as 1000+ cores. At the other end, Lustre has also been ported
to platforms like ARM and KNL, which have unique NUMA and core setups.
For example, some devices have NUMA nodes with no cores. These new
platforms exposed the limitations of Lustre's SMP code, so a lot of
work was needed. The result is this patch set, which has been tested
on those platforms.

Amir Shehata (8):
staging: lustre: libcfs: replace MAX_NUMNODES with nr_node_ids
staging: lustre: libcfs: remove excess space
staging: lustre: libcfs: replace num_possible_cpus() with nr_cpu_ids
staging: lustre: libcfs: NUMA support
staging: lustre: libcfs: add cpu distance handling
staging: lustre: libcfs: use distance in cpu and node handling
staging: lustre: libcfs: provide debugfs files for distance handling
staging: lustre: libcfs: invert error handling for cfs_cpt_table_print

Dmitry Eremin (15):
staging: lustre: libcfs: remove useless CPU partition code
staging: lustre: libcfs: rename variable i to cpu
staging: lustre: libcfs: fix libcfs_cpu coding style
staging: lustre: libcfs: use int type for CPT identification.
staging: lustre: libcfs: rename i to node for cfs_cpt_set_nodemask
staging: lustre: libcfs: rename i to cpu for cfs_cpt_bind
staging: lustre: libcfs: rename cpumask_var_t variables to *_mask
staging: lustre: libcfs: rename goto label in cfs_cpt_table_print
staging: lustre: libcfs: update debug messages
staging: lustre: libcfs: make tolerant to offline CPUs and empty NUMA nodes
staging: lustre: libcfs: report NUMA node instead of just node
staging: lustre: libcfs: update debug messages in CPT code
staging: lustre: libcfs: rework CPU pattern parsing code
staging: lustre: libcfs: change CPT estimate algorithm
staging: lustre: ptlrpc: use current CPU instead of hardcoded 0

James Simmons (2):
staging: lustre: libcfs: restore UMP handling
staging: lustre: libcfs: properly handle failure cases in SMP code

.../lustre/include/linux/libcfs/libcfs_cpu.h | 225 +++--
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 965 +++++++++++----------
drivers/staging/lustre/lnet/libcfs/module.c | 57 ++
drivers/staging/lustre/lnet/lnet/lib-msg.c | 2 +
drivers/staging/lustre/lustre/ptlrpc/service.c | 11 +-
5 files changed, 728 insertions(+), 532 deletions(-)

--
1.8.3.1



2018-05-29 14:23:28

by James Simmons

Subject: [PATCH v2 03/25] staging: lustre: libcfs: rename variable i to cpu

From: Dmitry Eremin <[email protected]>

Rename the variable i used with for_each_cpu() to cpu for code
readability.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23303
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased to handle recent cleanups in libcfs

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 951a9ca..34df7ed 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -340,7 +340,7 @@ struct cfs_cpt_table *
cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt,
const cpumask_t *mask)
{
- int i;
+ int cpu;

if (!cpumask_weight(mask) ||
cpumask_any_and(mask, cpu_online_mask) >= nr_cpu_ids) {
@@ -349,8 +349,8 @@ struct cfs_cpt_table *
return 0;
}

- for_each_cpu(i, mask) {
- if (!cfs_cpt_set_cpu(cptab, cpt, i))
+ for_each_cpu(cpu, mask) {
+ if (!cfs_cpt_set_cpu(cptab, cpt, cpu))
return 0;
}

@@ -362,10 +362,10 @@ struct cfs_cpt_table *
cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt,
const cpumask_t *mask)
{
- int i;
+ int cpu;

- for_each_cpu(i, mask)
- cfs_cpt_unset_cpu(cptab, cpt, i);
+ for_each_cpu(cpu, mask)
+ cfs_cpt_unset_cpu(cptab, cpt, cpu);
}
EXPORT_SYMBOL(cfs_cpt_unset_cpumask);

--
1.8.3.1


2018-05-29 14:24:29

by James Simmons

Subject: [PATCH v2 25/25] staging: lustre: ptlrpc: use current CPU instead of hardcoded 0

From: Dmitry Eremin <[email protected]>

Fix a crash that occurs when CPU 0 is disabled.
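For context, the thread-factor "depression" visible in the first hunk's surrounding lines is simple shift arithmetic: when the sibling mask shows hyper-threads, the factor is reduced to roughly 5/8 of its value. A minimal userspace sketch of just that arithmetic (depress_factor is an illustrative name, not a kernel function):

```c
/* Reduce the service thread factor when hyper-threads are present:
 * factor - factor/2 + factor/8 == 5/8 of the original (via shifts). */
static int depress_factor(int factor)
{
	return factor - (factor >> 1) + (factor >> 3);
}
```

For example, a factor of 16 becomes 16 - 8 + 2 = 10.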

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8710
Reviewed-on: https://review.whamcloud.com/23305
Reviewed-by: Doug Oucharek <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) New patch to address crash in ptlrpc

drivers/staging/lustre/lustre/ptlrpc/service.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/service.c b/drivers/staging/lustre/lustre/ptlrpc/service.c
index 3fd8c74..8e74a45 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/service.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/service.c
@@ -421,7 +421,7 @@ static void ptlrpc_at_timer(struct timer_list *t)
* there are.
*/
/* weight is # of HTs */
- if (cpumask_weight(topology_sibling_cpumask(0)) > 1) {
+ if (cpumask_weight(topology_sibling_cpumask(smp_processor_id())) > 1) {
/* depress thread factor for hyper-thread */
factor = factor - (factor >> 1) + (factor >> 3);
}
@@ -2221,15 +2221,16 @@ static int ptlrpc_hr_main(void *arg)
struct ptlrpc_hr_thread *hrt = arg;
struct ptlrpc_hr_partition *hrp = hrt->hrt_partition;
LIST_HEAD(replies);
- char threadname[20];
int rc;

- snprintf(threadname, sizeof(threadname), "ptlrpc_hr%02d_%03d",
- hrp->hrp_cpt, hrt->hrt_id);
unshare_fs_struct();

rc = cfs_cpt_bind(ptlrpc_hr.hr_cpt_table, hrp->hrp_cpt);
if (rc != 0) {
+ char threadname[20];
+
+ snprintf(threadname, sizeof(threadname), "ptlrpc_hr%02d_%03d",
+ hrp->hrp_cpt, hrt->hrt_id);
CWARN("Failed to bind %s on CPT %d of CPT table %p: rc = %d\n",
threadname, hrp->hrp_cpt, ptlrpc_hr.hr_cpt_table, rc);
}
@@ -2528,7 +2529,7 @@ int ptlrpc_hr_init(void)

init_waitqueue_head(&ptlrpc_hr.hr_waitq);

- weight = cpumask_weight(topology_sibling_cpumask(0));
+ weight = cpumask_weight(topology_sibling_cpumask(smp_processor_id()));

cfs_percpt_for_each(hrp, i, ptlrpc_hr.hr_partitions) {
hrp->hrp_cpt = i;
--
1.8.3.1


2018-05-29 14:24:44

by James Simmons

Subject: [PATCH v2 06/25] staging: lustre: libcfs: remove excess space

From: Amir Shehata <[email protected]>

The function cfs_cpt_table_print() was adding two spaces
to the string buffer. Add the space only once.
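The code around the hunk appends to the output buffer with the usual incremental snprintf pattern: advance the write pointer by the count snprintf returns and bail out once the remaining length is exhausted. A userspace sketch of that pattern (print_parts and the part contents are made up for illustration; -7 stands in for the kernel's -E2BIG):

```c
#include <stdio.h>

/* Append "<id>\t: <value>\n" per partition into a fixed buffer,
 * tracking the remaining space after each snprintf call. */
static int print_parts(char *buf, int len, int nparts)
{
	char *tmp = buf;
	int rc, i;

	for (i = 0; i < nparts; i++) {
		if (len <= 0)
			return -7;	/* out of space, like -E2BIG */

		rc = snprintf(tmp, len, "%d\t:", i);
		len -= rc;
		if (len <= 0)		/* rc >= len means truncation */
			return -7;
		tmp += rc;

		rc = snprintf(tmp, len, " %d\n", i * 2);
		len -= rc;
		if (len <= 0)
			return -7;
		tmp += rc;
	}
	return tmp - buf;		/* bytes written */
}
```

The truncation check works because C99 snprintf returns the length the output would have had, so len goes non-positive exactly when the buffer was too small.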

Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. Same code

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index d3017e8..d9d1388 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -173,7 +173,7 @@ struct cfs_cpt_table *

for (i = 0; i < cptab->ctb_nparts; i++) {
if (len > 0) {
- rc = snprintf(tmp, len, "%d\t: ", i);
+ rc = snprintf(tmp, len, "%d\t:", i);
len -= rc;
}

--
1.8.3.1


2018-05-29 14:24:48

by James Simmons

Subject: [PATCH v2 22/25] staging: lustre: libcfs: update debug messages in CPT code

From: Dmitry Eremin <[email protected]>

Update the debug messages for the CPT table creation code. Place
the passed-in string in quotes to make it clear what it is, and
capitalize "CPU" in the debug strings.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23306
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Patrick Farrell <[email protected]>
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code from earlier patch

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 0fc102c..649f7f9 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -496,7 +496,7 @@ void cfs_cpt_unset_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)

} else if (cpt != cptab->ctb_cpu2cpt[cpu]) {
CDEBUG(D_INFO,
- "CPU %d is not in cpu-partition %d\n", cpu, cpt);
+ "CPU %d is not in CPU partition %d\n", cpu, cpt);
return;
}

@@ -940,14 +940,14 @@ static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)
if (!ncpt ||
(node && ncpt > num_online_nodes()) ||
(!node && ncpt > num_online_cpus())) {
- CERROR("Invalid pattern %s, or too many partitions %d\n",
+ CERROR("Invalid pattern '%s', or too many partitions %d\n",
pattern, ncpt);
return NULL;
}

cptab = cfs_cpt_table_alloc(ncpt);
if (!cptab) {
- CERROR("Failed to allocate cpu partition table\n");
+ CERROR("Failed to allocate CPU partition table\n");
return NULL;
}

@@ -978,11 +978,11 @@ static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)

if (!bracket) {
if (*str) {
- CERROR("Invalid pattern %s\n", str);
+ CERROR("Invalid pattern '%s'\n", str);
goto failed;
}
if (c != ncpt) {
- CERROR("expect %d partitions but found %d\n",
+ CERROR("Expect %d partitions but found %d\n",
ncpt, c);
goto failed;
}
@@ -990,7 +990,7 @@ static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)
}

if (sscanf(str, "%d%n", &cpt, &n) < 1) {
- CERROR("Invalid cpu pattern %s\n", str);
+ CERROR("Invalid CPU pattern '%s'\n", str);
goto failed;
}

@@ -1007,20 +1007,20 @@ static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)

str = strim(str + n);
if (str != bracket) {
- CERROR("Invalid pattern %s\n", str);
+ CERROR("Invalid pattern '%s'\n", str);
goto failed;
}

bracket = strchr(str, ']');
if (!bracket) {
- CERROR("Missing right bracket for partition %d, %s\n",
+ CERROR("Missing right bracket for partition %d in '%s'\n",
cpt, str);
goto failed;
}

if (cfs_expr_list_parse(str, (bracket - str) + 1,
0, high, &el)) {
- CERROR("Can't parse number range: %s\n", str);
+ CERROR("Can't parse number range in '%s'\n", str);
goto failed;
}

--
1.8.3.1


2018-05-29 14:25:27

by James Simmons

Subject: [PATCH v2 21/25] staging: lustre: libcfs: report NUMA node instead of just node

From: Dmitry Eremin <[email protected]>

Reporting "HW nodes" is too generic. It really is reporting
"HW NUMA nodes". Update the debug message.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23306
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Patrick Farrell <[email protected]>
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code from earlier patch

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 99a9494..0fc102c 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -1138,7 +1138,7 @@ int cfs_cpu_init(void)

put_online_cpus();

- LCONSOLE(0, "HW nodes: %d, HW CPU cores: %d, npartitions: %d\n",
+ LCONSOLE(0, "HW NUMA nodes: %d, HW CPU cores: %d, npartitions: %d\n",
num_online_nodes(), num_online_cpus(),
cfs_cpt_number(cfs_cpt_tab));
return 0;
--
1.8.3.1


2018-05-29 14:25:31

by James Simmons

Subject: [PATCH v2 18/25] staging: lustre: libcfs: rename goto label in cfs_cpt_table_print

From: Dmitry Eremin <[email protected]>

Rename the goto label out to err.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23222
Reviewed-by: Amir Shehata <[email protected]>
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code from earlier patch

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index fb27dac..e12d337 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -193,20 +193,20 @@ int cfs_cpt_table_print(struct cfs_cpt_table *cptab, char *buf, int len)

for (i = 0; i < cptab->ctb_nparts; i++) {
if (len <= 0)
- goto out;
+ goto err;

rc = snprintf(tmp, len, "%d\t:", i);
len -= rc;

if (len <= 0)
- goto out;
+ goto err;

tmp += rc;
for_each_cpu(j, cptab->ctb_parts[i].cpt_cpumask) {
rc = snprintf(tmp, len, "%d ", j);
len -= rc;
if (len <= 0)
- goto out;
+ goto err;
tmp += rc;
}

@@ -216,7 +216,7 @@ int cfs_cpt_table_print(struct cfs_cpt_table *cptab, char *buf, int len)
}

return tmp - buf;
-out:
+err:
return -E2BIG;
}
EXPORT_SYMBOL(cfs_cpt_table_print);
--
1.8.3.1


2018-05-29 14:25:36

by James Simmons

Subject: [PATCH v2 23/25] staging: lustre: libcfs: rework CPU pattern parsing code

From: Dmitry Eremin <[email protected]>

Currently the module parameter string for the CPU pattern can be
modified, which is wrong. Rewrite the CPU pattern parsing code to
work on a duplicate of the buffer so the passed string is never
changed. This change also enables real error propagation to the
calling functions.

Signed-off-by: Dmitry Eremin <[email protected]>
Signed-off-by: Amir Shehata <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23306
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-9715
Reviewed-on: https://review.whamcloud.com/27872
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Patrick Farrell <[email protected]>
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code from earlier patch

.../lustre/include/linux/libcfs/libcfs_cpu.h | 2 +-
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 146 ++++++++++++---------
2 files changed, 87 insertions(+), 61 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
index c0aa0b3..12ed0a9 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
@@ -393,7 +393,7 @@ static inline int cfs_cpu_init(void)

static inline void cfs_cpu_fini(void)
{
- if (cfs_cpt_tab) {
+ if (!IS_ERR_OR_NULL(cfs_cpt_tab)) {
cfs_cpt_table_free(cfs_cpt_tab);
cfs_cpt_tab = NULL;
}
diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 649f7f9..aed48de 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -692,11 +692,11 @@ int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
nodemask = cptab->ctb_parts[cpt].cpt_nodemask;
}

- if (cpumask_any_and(*cpumask, cpu_online_mask) >= nr_cpu_ids) {
+ if (!cpumask_intersects(*cpumask, cpu_online_mask)) {
CDEBUG(D_INFO,
"No online CPU found in CPU partition %d, did someone do CPU hotplug on system? You might need to reload Lustre modules to keep system working well.\n",
cpt);
- return -EINVAL;
+ return -ENODEV;
}

for_each_online_cpu(cpu) {
@@ -860,11 +860,13 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
cptab = cfs_cpt_table_alloc(ncpt);
if (!cptab) {
CERROR("Failed to allocate CPU map(%d)\n", ncpt);
+ rc = -ENOMEM;
goto failed;
}

if (!zalloc_cpumask_var(&node_mask, GFP_NOFS)) {
CERROR("Failed to allocate scratch cpumask\n");
+ rc = -ENOMEM;
goto failed;
}

@@ -879,8 +881,10 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)

rc = cfs_cpt_choose_ncpus(cptab, cpt, node_mask,
num - ncpu);
- if (rc < 0)
+ if (rc < 0) {
+ rc = -EINVAL;
goto failed_mask;
+ }

ncpu = cpumask_weight(part->cpt_cpumask);
if (ncpu == num + !!(rem > 0)) {
@@ -903,37 +907,51 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
if (cptab)
cfs_cpt_table_free(cptab);

- return NULL;
+ return ERR_PTR(rc);
}

-static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)
+static struct cfs_cpt_table *cfs_cpt_table_create_pattern(const char *pattern)
{
struct cfs_cpt_table *cptab;
+ char *pattern_dup;
+ char *bracket;
char *str;
int node = 0;
- int high;
int ncpt = 0;
- int cpt;
+ int cpt = 0;
+ int high;
int rc;
int c;
int i;

- str = strim(pattern);
+ pattern_dup = kstrdup(pattern, GFP_KERNEL);
+ if (!pattern_dup) {
+ CERROR("Failed to duplicate pattern '%s'\n", pattern);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ str = strim(pattern_dup);
if (*str == 'n' || *str == 'N') {
- pattern = str + 1;
- if (*pattern != '\0') {
- node = 1;
- } else { /* shortcut to create CPT from NUMA & CPU topology */
+ str++; /* skip 'N' char */
+ node = 1; /* NUMA pattern */
+ if (*str == '\0') {
node = -1;
- ncpt = num_online_nodes();
+ for_each_online_node(i) {
+ if (!cpumask_empty(cpumask_of_node(i)))
+ ncpt++;
+ }
+ if (ncpt == 1) { /* single NUMA node */
+ kfree(pattern_dup);
+ return cfs_cpt_table_create(cpu_npartitions);
+ }
}
}

if (!ncpt) { /* scanning bracket which is mark of partition */
- for (str = pattern;; str++, ncpt++) {
- str = strchr(str, '[');
- if (!str)
- break;
+ bracket = str;
+ while ((bracket = strchr(bracket, '['))) {
+ bracket++;
+ ncpt++;
}
}

@@ -941,87 +959,96 @@ static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)
(node && ncpt > num_online_nodes()) ||
(!node && ncpt > num_online_cpus())) {
CERROR("Invalid pattern '%s', or too many partitions %d\n",
- pattern, ncpt);
- return NULL;
+ pattern_dup, ncpt);
+ rc = -EINVAL;
+ goto err_free_str;
}

cptab = cfs_cpt_table_alloc(ncpt);
if (!cptab) {
CERROR("Failed to allocate CPU partition table\n");
- return NULL;
+ rc = -ENOMEM;
+ goto err_free_str;
}

if (node < 0) { /* shortcut to create CPT from NUMA & CPU topology */
- cpt = 0;
-
for_each_online_node(i) {
- if (cpt >= ncpt) {
- CERROR("CPU changed while setting CPU partition table, %d/%d\n",
- cpt, ncpt);
- goto failed;
- }
+ if (cpumask_empty(cpumask_of_node(i)))
+ continue;

rc = cfs_cpt_set_node(cptab, cpt++, i);
- if (!rc)
- goto failed;
+ if (!rc) {
+ rc = -EINVAL;
+ goto err_free_table;
+ }
}
+ kfree(pattern_dup);
return cptab;
}

high = node ? nr_node_ids - 1 : nr_cpu_ids - 1;

- for (str = strim(pattern), c = 0;; c++) {
+ for (str = strim(str), c = 0; /* until break */; c++) {
struct cfs_range_expr *range;
struct cfs_expr_list *el;
- char *bracket = strchr(str, '[');
int n;

+ bracket = strchr(str, '[');
if (!bracket) {
if (*str) {
CERROR("Invalid pattern '%s'\n", str);
- goto failed;
+ rc = -EINVAL;
+ goto err_free_table;
}
if (c != ncpt) {
CERROR("Expect %d partitions but found %d\n",
ncpt, c);
- goto failed;
+ rc = -EINVAL;
+ goto err_free_table;
}
break;
}

if (sscanf(str, "%d%n", &cpt, &n) < 1) {
CERROR("Invalid CPU pattern '%s'\n", str);
- goto failed;
+ rc = -EINVAL;
+ goto err_free_table;
}

if (cpt < 0 || cpt >= ncpt) {
CERROR("Invalid partition id %d, total partitions %d\n",
cpt, ncpt);
- goto failed;
+ rc = -EINVAL;
+ goto err_free_table;
}

if (cfs_cpt_weight(cptab, cpt)) {
CERROR("Partition %d has already been set.\n", cpt);
- goto failed;
+ rc = -EPERM;
+ goto err_free_table;
}

str = strim(str + n);
if (str != bracket) {
CERROR("Invalid pattern '%s'\n", str);
- goto failed;
+ rc = -EINVAL;
+ goto err_free_table;
}

bracket = strchr(str, ']');
if (!bracket) {
CERROR("Missing right bracket for partition %d in '%s'\n",
cpt, str);
- goto failed;
+ rc = -EINVAL;
+ goto err_free_table;
}

- if (cfs_expr_list_parse(str, (bracket - str) + 1,
- 0, high, &el)) {
+ rc = cfs_expr_list_parse(str, (bracket - str) + 1, 0, high,
+ &el);
+ if (rc) {
CERROR("Can't parse number range in '%s'\n", str);
- goto failed;
+ rc = -ERANGE;
+ goto err_free_table;
}

list_for_each_entry(range, &el->el_exprs, re_link) {
@@ -1033,7 +1060,8 @@ static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)
cfs_cpt_set_cpu(cptab, cpt, i);
if (!rc) {
cfs_expr_list_free(el);
- goto failed;
+ rc = -EINVAL;
+ goto err_free_table;
}
}
}
@@ -1042,17 +1070,21 @@ static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)

if (!cfs_cpt_online(cptab, cpt)) {
CERROR("No online CPU is found on partition %d\n", cpt);
- goto failed;
+ rc = -ENODEV;
+ goto err_free_table;
}

str = strim(bracket + 1);
}

+ kfree(pattern_dup);
return cptab;

-failed:
+err_free_table:
cfs_cpt_table_free(cptab);
- return NULL;
+err_free_str:
+ kfree(pattern_dup);
+ return ERR_PTR(rc);
}

#ifdef CONFIG_HOTPLUG_CPU
@@ -1079,7 +1111,7 @@ static int cfs_cpu_dead(unsigned int cpu)

void cfs_cpu_fini(void)
{
- if (cfs_cpt_tab)
+ if (!IS_ERR_OR_NULL(cfs_cpt_tab))
cfs_cpt_table_free(cfs_cpt_tab);

#ifdef CONFIG_HOTPLUG_CPU
@@ -1112,26 +1144,20 @@ int cfs_cpu_init(void)
#endif
get_online_cpus();
if (*cpu_pattern) {
- char *cpu_pattern_dup = kstrdup(cpu_pattern, GFP_KERNEL);
-
- if (!cpu_pattern_dup) {
- CERROR("Failed to duplicate cpu_pattern\n");
- goto failed_alloc_table;
- }
-
- cfs_cpt_tab = cfs_cpt_table_create_pattern(cpu_pattern_dup);
- kfree(cpu_pattern_dup);
- if (!cfs_cpt_tab) {
- CERROR("Failed to create cptab from pattern %s\n",
+ cfs_cpt_tab = cfs_cpt_table_create_pattern(cpu_pattern);
+ if (IS_ERR(cfs_cpt_tab)) {
+ CERROR("Failed to create cptab from pattern '%s'\n",
cpu_pattern);
+ ret = PTR_ERR(cfs_cpt_tab);
goto failed_alloc_table;
}

} else {
cfs_cpt_tab = cfs_cpt_table_create(cpu_npartitions);
- if (!cfs_cpt_tab) {
- CERROR("Failed to create ptable with npartitions %d\n",
+ if (IS_ERR(cfs_cpt_tab)) {
+ CERROR("Failed to create cptab with npartitions %d\n",
cpu_npartitions);
+ ret = PTR_ERR(cfs_cpt_tab);
goto failed_alloc_table;
}
}
@@ -1146,7 +1172,7 @@ int cfs_cpu_init(void)
failed_alloc_table:
put_online_cpus();

- if (cfs_cpt_tab)
+ if (!IS_ERR_OR_NULL(cfs_cpt_tab))
cfs_cpt_table_free(cfs_cpt_tab);

ret = -EINVAL;
--
1.8.3.1


2018-05-29 14:26:00

by James Simmons

Subject: [PATCH v2 14/25] staging: lustre: libcfs: use int type for CPT identification.

From: Dmitry Eremin <[email protected]>

Use the int type for CPT identification to match how the Linux
kernel identifies CPUs.
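One reason the signed type is safer here: comparing a signed id against an unsigned count silently converts the signed side to unsigned, so negative values (error returns, "unset" markers) can pass range checks they should fail. A small illustration (both function names are made up):

```c
/* Plain "cpt < nparts" with an unsigned count: a negative cpt is
 * converted to a huge unsigned value, so the comparison misfires. */
static int lt_unsigned(int cpt, unsigned int nparts)
{
	return cpt < nparts;
}

/* With signed types throughout, the intended range check works. */
static int id_is_valid(int cpt, int nparts)
{
	return cpt >= 0 && cpt < nparts;
}
```

The same concern is why the patch changes min(2U, ncpt) to min(2, ncpt): with ncpt now an int, the unsigned literal would force a mixed-sign comparison.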

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23304
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch to handle recent libcfs changes

drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h | 8 ++++----
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 14 +++++++-------
2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
index 2c97adf..9f4ba9d 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
@@ -90,7 +90,7 @@ struct cfs_cpu_partition {
/* NUMA distance between CPTs */
unsigned int *cpt_distance;
/* spread rotor for NUMA allocator */
- unsigned int cpt_spread_rotor;
+ int cpt_spread_rotor;
};
#endif /* CONFIG_SMP */

@@ -98,11 +98,11 @@ struct cfs_cpu_partition {
struct cfs_cpt_table {
#ifdef CONFIG_SMP
/* spread rotor for NUMA allocator */
- unsigned int ctb_spread_rotor;
+ int ctb_spread_rotor;
/* maximum NUMA distance between all nodes in table */
unsigned int ctb_distance;
/* # of CPU partitions */
- unsigned int ctb_nparts;
+ int ctb_nparts;
/* partitions tables */
struct cfs_cpu_partition *ctb_parts;
/* shadow HW CPU to CPU partition ID */
@@ -128,7 +128,7 @@ struct cfs_cpt_table {
/**
* create a cfs_cpt_table with \a ncpt number of partitions
*/
-struct cfs_cpt_table *cfs_cpt_table_alloc(unsigned int ncpt);
+struct cfs_cpt_table *cfs_cpt_table_alloc(int ncpt);
/**
* return cpumask of CPU partition \a cpt
*/
diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index fab6675..14d5791 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -69,7 +69,7 @@
module_param(cpu_pattern, charp, 0444);
MODULE_PARM_DESC(cpu_pattern, "CPU partitions pattern");

-struct cfs_cpt_table *cfs_cpt_table_alloc(unsigned int ncpt)
+struct cfs_cpt_table *cfs_cpt_table_alloc(int ncpt)
{
struct cfs_cpt_table *cptab;
int i;
@@ -784,13 +784,13 @@ static int cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,
return rc;
}

-#define CPT_WEIGHT_MIN 4u
+#define CPT_WEIGHT_MIN 4

-static unsigned int cfs_cpt_num_estimate(void)
+static int cfs_cpt_num_estimate(void)
{
- unsigned int nnode = num_online_nodes();
- unsigned int ncpu = num_online_cpus();
- unsigned int ncpt;
+ int nnode = num_online_nodes();
+ int ncpu = num_online_cpus();
+ int ncpt;

if (ncpu <= CPT_WEIGHT_MIN) {
ncpt = 1;
@@ -820,7 +820,7 @@ static unsigned int cfs_cpt_num_estimate(void)
/* config many CPU partitions on 32-bit system could consume
* too much memory
*/
- ncpt = min(2U, ncpt);
+ ncpt = min(2, ncpt);
#endif
while (ncpu % ncpt)
ncpt--; /* worst case is 1 */
--
1.8.3.1


2018-05-29 14:26:28

by James Simmons

Subject: [PATCH v2 24/25] staging: lustre: libcfs: change CPT estimate algorithm

From: Dmitry Eremin <[email protected]>

The main idea of having more CPU partitions is based on KNL experience.
When a thread submits IO for network communication, one of the threads
from the current CPT is used for the network stack. With high
parallelization many threads become involved in network submission,
but with fewer CPU partitions they have to wait until a single thread
processes them from the network queue. So with a small number of CPU
partitions the bottleneck simply moves into the network layer. My
experiments showed that performance was best when each IO thread has
one network thread. This condition is met by having two real HW cores
(without hyper-threads) per CPT, which is exactly what this patch
implements.

Change the CPT estimate algorithm from 2 * (N - 1)^2 < NCPUS <= 2 * N^2
to two HW cores per CPT. This is critical for machines whose core
count is not a power of two.

Current algorithm splits CPTs in KNL:
LNet: HW CPU cores: 272, npartitions: 16
cpu_partition_table=
0 : 0-4,68-71,136-139,204-207
1 : 5-9,73-76,141-144,209-212
2 : 10-14,78-81,146-149,214-217
3 : 15-17,72,77,83-85,140,145,151-153,208,219-221
4 : 18-21,82,86-88,150,154-156,213,218,222-224
5 : 22-26,90-93,158-161,226-229
6 : 27-31,95-98,163-166,231-234
7 : 32-35,89,100-103,168-171,236-239
8 : 36-38,94,99,104-105,157,162,167,172-173,225,230,235,240-241
9 : 39-43,107-110,175-178,243-246
10 : 44-48,112-115,180-183,248-251
11 : 49-51,106,111,117-119,174,179,185-187,242,253-255
12 : 52-55,116,120-122,184,188-190,247,252,256-258
13 : 56-60,124-127,192-195,260-263
14 : 61-65,129-132,197-200,265-268
15 : 66-67,123,128,133-135,191,196,201-203,259,264,269-271

New algorithm will split CPTs in KNL:
LNet: HW CPU cores: 272, npartitions: 34
cpu_partition_table=
0 : 0-1,68-69,136-137,204-205
1 : 2-3,70-71,138-139,206-207
2 : 4-5,72-73,140-141,208-209
3 : 6-7,74-75,142-143,210-211
4 : 8-9,76-77,144-145,212-213
5 : 10-11,78-79,146-147,214-215
6 : 12-13,80-81,148-149,216-217
7 : 14-15,82-83,150-151,218-219
8 : 16-17,84-85,152-153,220-221
9 : 18-19,86-87,154-155,222-223
10 : 20-21,88-89,156-157,224-225
11 : 22-23,90-91,158-159,226-227
12 : 24-25,92-93,160-161,228-229
13 : 26-27,94-95,162-163,230-231
14 : 28-29,96-97,164-165,232-233
15 : 30-31,98-99,166-167,234-235
16 : 32-33,100-101,168-169,236-237
17 : 34-35,102-103,170-171,238-239
18 : 36-37,104-105,172-173,240-241
19 : 38-39,106-107,174-175,242-243
20 : 40-41,108-109,176-177,244-245
21 : 42-43,110-111,178-179,246-247
22 : 44-45,112-113,180-181,248-249
23 : 46-47,114-115,182-183,250-251
24 : 48-49,116-117,184-185,252-253
25 : 50-51,118-119,186-187,254-255
26 : 52-53,120-121,188-189,256-257
27 : 54-55,122-123,190-191,258-259
28 : 56-57,124-125,192-193,260-261
29 : 58-59,126-127,194-195,262-263
30 : 60-61,128-129,196-197,264-265
31 : 62-63,130-131,198-199,266-267
32 : 64-65,132-133,200-201,268-269
33 : 66-67,134-135,202-203,270-271

The 'N' pattern does not always work well on KNL. In flat mode it
yields a single CPT with all CPUs inside.

In SNC-4 mode:
cpu_partition_table=
0 : 0-17,68-85,136-153,204-221
1 : 18-35,86-103,154-171,222-239
2 : 36-51,104-119,172-187,240-255
3 : 52-67,120-135,188-203,256-271
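The two heuristics can be compared with a small userspace model (function names are illustrative; nnode models the online NUMA node count, nthr the sibling-thread count per core). For KNL's 272 logical CPUs it reproduces the 16-vs-34 split shown above:

```c
/* Old heuristic: smallest power of two N with NCPUS <= 2 * N^2,
 * then rounded against the NUMA node count. */
static int cpt_estimate_old(int ncpu, int nnode)
{
	int ncpt;

	if (ncpu <= 4)			/* CPT_WEIGHT_MIN */
		return 1;

	for (ncpt = 2; ncpu > 2 * ncpt * ncpt; ncpt <<= 1)
		;

	if (ncpt <= nnode) {		/* fat NUMA system */
		while (nnode > ncpt)
			nnode >>= 1;
	} else {			/* ncpt > nnode */
		while ((nnode << 1) <= ncpt)
			nnode <<= 1;
	}
	ncpt = nnode;

	while (ncpu % ncpt)
		ncpt--;			/* worst case is 1 */
	return ncpt;
}

/* New heuristic: about two HW cores (2 * nthr logical CPUs) per CPT. */
static int cpt_estimate_new(int ncpu, int nthr)
{
	int ncpt = 1;

	if (ncpu > 4)			/* CPT_WEIGHT_MIN */
		for (ncpt = 2; ncpu > 2 * nthr * ncpt; ncpt++)
			;

	while (ncpu % ncpt)
		ncpt--;
	return ncpt;
}
```

With ncpu = 272 the old code yields 16 partitions regardless of node count, while the new code with nthr = 4 hyper-threads per core yields 34, matching the tables above.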

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/24304
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code from earlier patch

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 30 +++++--------------------
1 file changed, 5 insertions(+), 25 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index aed48de..ff752d5 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -798,34 +798,14 @@ static int cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,

static int cfs_cpt_num_estimate(void)
{
- int nnode = num_online_nodes();
+ int nthr = cpumask_weight(topology_sibling_cpumask(smp_processor_id()));
int ncpu = num_online_cpus();
- int ncpt;
+ int ncpt = 1;

- if (ncpu <= CPT_WEIGHT_MIN) {
- ncpt = 1;
- goto out;
- }
-
- /* generate reasonable number of CPU partitions based on total number
- * of CPUs, Preferred N should be power2 and match this condition:
- * 2 * (N - 1)^2 < NCPUS <= 2 * N^2
- */
- for (ncpt = 2; ncpu > 2 * ncpt * ncpt; ncpt <<= 1)
- ;
-
- if (ncpt <= nnode) { /* fat numa system */
- while (nnode > ncpt)
- nnode >>= 1;
+ if (ncpu > CPT_WEIGHT_MIN)
+ for (ncpt = 2; ncpu > 2 * nthr * ncpt; ncpt++)
+ ; /* nothing */

- } else { /* ncpt > nnode */
- while ((nnode << 1) <= ncpt)
- nnode <<= 1;
- }
-
- ncpt = nnode;
-
-out:
#if (BITS_PER_LONG == 32)
/* config many CPU partitions on 32-bit system could consume
* too much memory
--
1.8.3.1


2018-05-29 14:26:31

by James Simmons

Subject: [PATCH v2 17/25] staging: lustre: libcfs: rename cpumask_var_t variables to *_mask

From: Dmitry Eremin <[email protected]>

Because we handle both CPU masks and core identifiers, the two can
easily be confused. To avoid this, rename the various cpumask_var_t
variables so that *_mask is appended to their names.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23222
Reviewed-by: Amir Shehata <[email protected]>
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code from earlier patch

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 62 ++++++++++++-------------
1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 1c10529..fb27dac 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -710,23 +710,23 @@ int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
* We always prefer to choose CPU in the same core/socket.
*/
static int cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,
- cpumask_t *node, int number)
+ cpumask_t *node_mask, int number)
{
- cpumask_var_t socket;
- cpumask_var_t core;
+ cpumask_var_t socket_mask;
+ cpumask_var_t core_mask;
int rc = 0;
int cpu;

LASSERT(number > 0);

- if (number >= cpumask_weight(node)) {
- while (!cpumask_empty(node)) {
- cpu = cpumask_first(node);
+ if (number >= cpumask_weight(node_mask)) {
+ while (!cpumask_empty(node_mask)) {
+ cpu = cpumask_first(node_mask);

rc = cfs_cpt_set_cpu(cptab, cpt, cpu);
if (!rc)
return -EINVAL;
- cpumask_clear_cpu(cpu, node);
+ cpumask_clear_cpu(cpu, node_mask);
}
return 0;
}
@@ -736,34 +736,34 @@ static int cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,
* As we cannot initialize a cpumask_var_t, we need
* to alloc both before we can risk trying to free either
*/
- if (!zalloc_cpumask_var(&socket, GFP_NOFS))
+ if (!zalloc_cpumask_var(&socket_mask, GFP_NOFS))
rc = -ENOMEM;
- if (!zalloc_cpumask_var(&core, GFP_NOFS))
+ if (!zalloc_cpumask_var(&core_mask, GFP_NOFS))
rc = -ENOMEM;
if (rc)
goto out;

- while (!cpumask_empty(node)) {
- cpu = cpumask_first(node);
+ while (!cpumask_empty(node_mask)) {
+ cpu = cpumask_first(node_mask);

/* get cpumask for cores in the same socket */
- cpumask_copy(socket, topology_core_cpumask(cpu));
- cpumask_and(socket, socket, node);
+ cpumask_copy(socket_mask, topology_core_cpumask(cpu));
+ cpumask_and(socket_mask, socket_mask, node_mask);

- LASSERT(!cpumask_empty(socket));
+ LASSERT(!cpumask_empty(socket_mask));

- while (!cpumask_empty(socket)) {
+ while (!cpumask_empty(socket_mask)) {
int i;

/* get cpumask for hts in the same core */
- cpumask_copy(core, topology_sibling_cpumask(cpu));
- cpumask_and(core, core, node);
+ cpumask_copy(core_mask, topology_sibling_cpumask(cpu));
+ cpumask_and(core_mask, core_mask, node_mask);

- LASSERT(!cpumask_empty(core));
+ LASSERT(!cpumask_empty(core_mask));

- for_each_cpu(i, core) {
- cpumask_clear_cpu(i, socket);
- cpumask_clear_cpu(i, node);
+ for_each_cpu(i, core_mask) {
+ cpumask_clear_cpu(i, socket_mask);
+ cpumask_clear_cpu(i, node_mask);

rc = cfs_cpt_set_cpu(cptab, cpt, i);
if (!rc) {
@@ -774,13 +774,13 @@ static int cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,
if (!--number)
goto out;
}
- cpu = cpumask_first(socket);
+ cpu = cpumask_first(socket_mask);
}
}

out:
- free_cpumask_var(socket);
- free_cpumask_var(core);
+ free_cpumask_var(socket_mask);
+ free_cpumask_var(core_mask);
return rc;
}

@@ -831,7 +831,7 @@ static int cfs_cpt_num_estimate(void)
static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
{
struct cfs_cpt_table *cptab = NULL;
- cpumask_var_t mask;
+ cpumask_var_t node_mask;
int cpt = 0;
int num;
int rc;
@@ -864,15 +864,15 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
goto failed;
}

- if (!zalloc_cpumask_var(&mask, GFP_NOFS)) {
+ if (!zalloc_cpumask_var(&node_mask, GFP_NOFS)) {
CERROR("Failed to allocate scratch cpumask\n");
goto failed;
}

for_each_online_node(i) {
- cpumask_copy(mask, cpumask_of_node(i));
+ cpumask_copy(node_mask, cpumask_of_node(i));

- while (!cpumask_empty(mask)) {
+ while (!cpumask_empty(node_mask)) {
struct cfs_cpu_partition *part;
int n;

@@ -889,7 +889,7 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
n = num - cpumask_weight(part->cpt_cpumask);
LASSERT(n > 0);

- rc = cfs_cpt_choose_ncpus(cptab, cpt, mask, n);
+ rc = cfs_cpt_choose_ncpus(cptab, cpt, node_mask, n);
if (rc < 0)
goto failed_mask;

@@ -907,12 +907,12 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
goto failed_mask;
}

- free_cpumask_var(mask);
+ free_cpumask_var(node_mask);

return cptab;

failed_mask:
- free_cpumask_var(mask);
+ free_cpumask_var(node_mask);
failed:
CERROR("Failed to setup CPU-partition-table with %d CPU-partitions, online HW nodes: %d, HW cpus: %d.\n",
ncpt, num_online_nodes(), num_online_cpus());
--
1.8.3.1


2018-05-29 14:26:33

by James Simmons

Subject: [PATCH v2 16/25] staging: lustre: libcfs: rename i to cpu for cfs_cpt_bind

From: Dmitry Eremin <[email protected]>

Rename variable i to cpu to make code easier to understand.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23222
Reviewed-by: Amir Shehata <[email protected]>
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index bac5601..1c10529 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -669,8 +669,8 @@ int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
{
cpumask_var_t *cpumask;
nodemask_t *nodemask;
+ int cpu;
int rc;
- int i;

LASSERT(cpt == CFS_CPT_ANY || (cpt >= 0 && cpt < cptab->ctb_nparts));

@@ -688,8 +688,8 @@ int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
return -EINVAL;
}

- for_each_online_cpu(i) {
- if (cpumask_test_cpu(i, *cpumask))
+ for_each_online_cpu(cpu) {
+ if (cpumask_test_cpu(cpu, *cpumask))
continue;

rc = set_cpus_allowed_ptr(current, *cpumask);
--
1.8.3.1


2018-05-29 14:26:42

by James Simmons

Subject: [PATCH v2 20/25] staging: lustre: libcfs: make tolerant to offline CPUs and empty NUMA nodes

From: Dmitry Eremin <[email protected]>

Rework the CPU partition code to make it more tolerant of offline
CPUs and empty NUMA nodes.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23222
Reviewed-by: Amir Shehata <[email protected]>
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code from earlier patch

.../lustre/include/linux/libcfs/libcfs_cpu.h | 2 +
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 132 +++++++++------------
drivers/staging/lustre/lnet/lnet/lib-msg.c | 2 +
3 files changed, 60 insertions(+), 76 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
index 9f4ba9d..c0aa0b3 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
@@ -91,6 +91,8 @@ struct cfs_cpu_partition {
unsigned int *cpt_distance;
/* spread rotor for NUMA allocator */
int cpt_spread_rotor;
+ /* NUMA node if cpt_nodemask is empty */
+ int cpt_node;
};
#endif /* CONFIG_SMP */

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 7f1061e..99a9494 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -457,8 +457,16 @@ int cfs_cpt_set_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
return 0;
}

- LASSERT(!cpumask_test_cpu(cpu, cptab->ctb_cpumask));
- LASSERT(!cpumask_test_cpu(cpu, cptab->ctb_parts[cpt].cpt_cpumask));
+ if (cpumask_test_cpu(cpu, cptab->ctb_cpumask)) {
+ CDEBUG(D_INFO, "CPU %d is already in cpumask\n", cpu);
+ return 0;
+ }
+
+ if (cpumask_test_cpu(cpu, cptab->ctb_parts[cpt].cpt_cpumask)) {
+ CDEBUG(D_INFO, "CPU %d is already in partition %d cpumask\n",
+ cpu, cptab->ctb_cpu2cpt[cpu]);
+ return 0;
+ }

cfs_cpt_add_cpu(cptab, cpt, cpu);
cfs_cpt_add_node(cptab, cpt, cpu_to_node(cpu));
@@ -527,8 +535,10 @@ void cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt,
{
int cpu;

- for_each_cpu(cpu, mask)
- cfs_cpt_unset_cpu(cptab, cpt, cpu);
+ for_each_cpu(cpu, mask) {
+ cfs_cpt_del_cpu(cptab, cpt, cpu);
+ cfs_cpt_del_node(cptab, cpt, cpu_to_node(cpu));
+ }
}
EXPORT_SYMBOL(cfs_cpt_unset_cpumask);

@@ -579,10 +589,8 @@ int cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab, int cpt,
{
int node;

- for_each_node_mask(node, *mask) {
- if (!cfs_cpt_set_node(cptab, cpt, node))
- return 0;
- }
+ for_each_node_mask(node, *mask)
+ cfs_cpt_set_node(cptab, cpt, node);

return 1;
}
@@ -603,7 +611,7 @@ int cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt)
nodemask_t *mask;
int weight;
int rotor;
- int node;
+ int node = 0;

/* convert CPU partition ID to HW node id */

@@ -613,20 +621,20 @@ int cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt)
} else {
mask = cptab->ctb_parts[cpt].cpt_nodemask;
rotor = cptab->ctb_parts[cpt].cpt_spread_rotor++;
+ node = cptab->ctb_parts[cpt].cpt_node;
}

weight = nodes_weight(*mask);
- LASSERT(weight > 0);
-
- rotor %= weight;
+ if (weight > 0) {
+ rotor %= weight;

- for_each_node_mask(node, *mask) {
- if (!rotor--)
- return node;
+ for_each_node_mask(node, *mask) {
+ if (!rotor--)
+ return node;
+ }
}

- LBUG();
- return 0;
+ return node;
}
EXPORT_SYMBOL(cfs_cpt_spread_node);

@@ -719,17 +727,21 @@ static int cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,
cpumask_var_t core_mask;
int rc = 0;
int cpu;
+ int i;

LASSERT(number > 0);

if (number >= cpumask_weight(node_mask)) {
while (!cpumask_empty(node_mask)) {
cpu = cpumask_first(node_mask);
+ cpumask_clear_cpu(cpu, node_mask);
+
+ if (!cpu_online(cpu))
+ continue;

rc = cfs_cpt_set_cpu(cptab, cpt, cpu);
if (!rc)
return -EINVAL;
- cpumask_clear_cpu(cpu, node_mask);
}
return 0;
}
@@ -750,24 +762,19 @@ static int cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,
cpu = cpumask_first(node_mask);

/* get cpumask for cores in the same socket */
- cpumask_copy(socket_mask, topology_core_cpumask(cpu));
- cpumask_and(socket_mask, socket_mask, node_mask);
-
- LASSERT(!cpumask_empty(socket_mask));
-
+ cpumask_and(socket_mask, topology_core_cpumask(cpu), node_mask);
while (!cpumask_empty(socket_mask)) {
- int i;
-
/* get cpumask for hts in the same core */
- cpumask_copy(core_mask, topology_sibling_cpumask(cpu));
- cpumask_and(core_mask, core_mask, node_mask);
-
- LASSERT(!cpumask_empty(core_mask));
+ cpumask_and(core_mask, topology_sibling_cpumask(cpu),
+ node_mask);

for_each_cpu(i, core_mask) {
cpumask_clear_cpu(i, socket_mask);
cpumask_clear_cpu(i, node_mask);

+ if (!cpu_online(i))
+ continue;
+
rc = cfs_cpt_set_cpu(cptab, cpt, i);
if (!rc) {
rc = -EINVAL;
@@ -836,23 +843,18 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
struct cfs_cpt_table *cptab = NULL;
cpumask_var_t node_mask;
int cpt = 0;
+ int node;
int num;
- int rc;
- int i;
+ int rem;
+ int rc = 0;

- rc = cfs_cpt_num_estimate();
+ num = cfs_cpt_num_estimate();
if (ncpt <= 0)
- ncpt = rc;
+ ncpt = num;

- if (ncpt > num_online_cpus() || ncpt > 4 * rc) {
+ if (ncpt > num_online_cpus() || ncpt > 4 * num) {
CWARN("CPU partition number %d is larger than suggested value (%d), your system may have performance issue or run out of memory while under pressure\n",
- ncpt, rc);
- }
-
- if (num_online_cpus() % ncpt) {
- CERROR("CPU number %d is not multiple of cpu_npartition %d, please try different cpu_npartitions value or set pattern string by cpu_pattern=STRING\n",
- (int)num_online_cpus(), ncpt);
- goto failed;
+ ncpt, num);
}

cptab = cfs_cpt_table_alloc(ncpt);
@@ -861,55 +863,33 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
goto failed;
}

- num = num_online_cpus() / ncpt;
- if (!num) {
- CERROR("CPU changed while setting CPU partition\n");
- goto failed;
- }
-
if (!zalloc_cpumask_var(&node_mask, GFP_NOFS)) {
CERROR("Failed to allocate scratch cpumask\n");
goto failed;
}

- for_each_online_node(i) {
- cpumask_copy(node_mask, cpumask_of_node(i));
-
- while (!cpumask_empty(node_mask)) {
- struct cfs_cpu_partition *part;
- int n;
-
- /*
- * Each emulated NUMA node has all allowed CPUs in
- * the mask.
- * End loop when all partitions have assigned CPUs.
- */
- if (cpt == ncpt)
- break;
-
- part = &cptab->ctb_parts[cpt];
+ num = num_online_cpus() / ncpt;
+ rem = num_online_cpus() % ncpt;
+ for_each_online_node(node) {
+ cpumask_copy(node_mask, cpumask_of_node(node));

- n = num - cpumask_weight(part->cpt_cpumask);
- LASSERT(n > 0);
+ while (cpt < ncpt && !cpumask_empty(node_mask)) {
+ struct cfs_cpu_partition *part = &cptab->ctb_parts[cpt];
+ int ncpu = cpumask_weight(part->cpt_cpumask);

- rc = cfs_cpt_choose_ncpus(cptab, cpt, node_mask, n);
+ rc = cfs_cpt_choose_ncpus(cptab, cpt, node_mask,
+ num - ncpu);
if (rc < 0)
goto failed_mask;

- LASSERT(num >= cpumask_weight(part->cpt_cpumask));
- if (num == cpumask_weight(part->cpt_cpumask))
+ ncpu = cpumask_weight(part->cpt_cpumask);
+ if (ncpu == num + !!(rem > 0)) {
cpt++;
+ rem--;
+ }
}
}

- if (cpt != ncpt ||
- num != cpumask_weight(cptab->ctb_parts[ncpt - 1].cpt_cpumask)) {
- CERROR("Expect %d(%d) CPU partitions but got %d(%d), CPU hotplug/unplug while setting?\n",
- cptab->ctb_nparts, num, cpt,
- cpumask_weight(cptab->ctb_parts[ncpt - 1].cpt_cpumask));
- goto failed_mask;
- }
-
free_cpumask_var(node_mask);

return cptab;
diff --git a/drivers/staging/lustre/lnet/lnet/lib-msg.c b/drivers/staging/lustre/lnet/lnet/lib-msg.c
index 0091273..27bdefa 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-msg.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-msg.c
@@ -568,6 +568,8 @@

/* number of CPUs */
container->msc_nfinalizers = cfs_cpt_weight(lnet_cpt_table(), cpt);
+ if (container->msc_nfinalizers == 0)
+ container->msc_nfinalizers = 1;

container->msc_finalizers = kvzalloc_cpt(container->msc_nfinalizers *
sizeof(*container->msc_finalizers),
--
1.8.3.1


2018-05-29 14:26:57

by James Simmons

Subject: [PATCH v2 19/25] staging: lustre: libcfs: update debug messages

From: Dmitry Eremin <[email protected]>

For cfs_cpt_bind() change the CERROR to a CDEBUG. Make the debug
message in cfs_cpt_table_create_pattern() more understandable.
Report the rc value when cfs_cpt_table_create() fails.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23222
Reviewed-by: Amir Shehata <[email protected]>
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code from earlier patch

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index e12d337..7f1061e 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -480,7 +480,8 @@ void cfs_cpt_unset_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
/* caller doesn't know the partition ID */
cpt = cptab->ctb_cpu2cpt[cpu];
if (cpt < 0) { /* not set in this CPT-table */
- CDEBUG(D_INFO, "Try to unset cpu %d which is not in CPT-table %p\n",
+ CDEBUG(D_INFO,
+ "Try to unset cpu %d which is not in CPT-table %p\n",
cpt, cptab);
return;
}
@@ -506,7 +507,8 @@ int cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt,

if (!cpumask_weight(mask) ||
cpumask_any_and(mask, cpu_online_mask) >= nr_cpu_ids) {
- CDEBUG(D_INFO, "No online CPU is found in the CPU mask for CPU partition %d\n",
+ CDEBUG(D_INFO,
+ "No online CPU is found in the CPU mask for CPU partition %d\n",
cpt);
return 0;
}
@@ -683,7 +685,8 @@ int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
}

if (cpumask_any_and(*cpumask, cpu_online_mask) >= nr_cpu_ids) {
- CERROR("No online CPU found in CPU partition %d, did someone do CPU hotplug on system? You might need to reload Lustre modules to keep system working well.\n",
+ CDEBUG(D_INFO,
+ "No online CPU found in CPU partition %d, did someone do CPU hotplug on system? You might need to reload Lustre modules to keep system working well.\n",
cpt);
return -EINVAL;
}
@@ -914,8 +917,8 @@ static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
failed_mask:
free_cpumask_var(node_mask);
failed:
- CERROR("Failed to setup CPU-partition-table with %d CPU-partitions, online HW nodes: %d, HW cpus: %d.\n",
- ncpt, num_online_nodes(), num_online_cpus());
+ CERROR("Failed (rc = %d) to setup CPU partition table with %d partitions, online HW NUMA nodes: %d, HW CPU cores: %d.\n",
+ rc, ncpt, num_online_nodes(), num_online_cpus());

if (cptab)
cfs_cpt_table_free(cptab);
@@ -1030,7 +1033,7 @@ static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)

bracket = strchr(str, ']');
if (!bracket) {
- CERROR("missing right bracket for cpt %d, %s\n",
+ CERROR("Missing right bracket for partition %d, %s\n",
cpt, str);
goto failed;
}
--
1.8.3.1


2018-05-29 14:27:32

by James Simmons

Subject: [PATCH v2 15/25] staging: lustre: libcfs: rename i to node for cfs_cpt_set_nodemask

From: Dmitry Eremin <[email protected]>

Rename variable i to node to make code easier to understand.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23222
Reviewed-by: Amir Shehata <[email protected]>
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No changes in code

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 14d5791..bac5601 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -575,10 +575,10 @@ void cfs_cpt_unset_node(struct cfs_cpt_table *cptab, int cpt, int node)
int cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab, int cpt,
const nodemask_t *mask)
{
- int i;
+ int node;

- for_each_node_mask(i, *mask) {
- if (!cfs_cpt_set_node(cptab, cpt, i))
+ for_each_node_mask(node, *mask) {
+ if (!cfs_cpt_set_node(cptab, cpt, node))
return 0;
}

@@ -589,10 +589,10 @@ int cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab, int cpt,
void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab, int cpt,
const nodemask_t *mask)
{
- int i;
+ int node;

- for_each_node_mask(i, *mask)
- cfs_cpt_unset_node(cptab, cpt, i);
+ for_each_node_mask(node, *mask)
+ cfs_cpt_unset_node(cptab, cpt, node);
}
EXPORT_SYMBOL(cfs_cpt_unset_nodemask);

--
1.8.3.1


2018-05-29 14:27:37

by James Simmons

Subject: [PATCH v2 13/25] staging: lustre: libcfs: fix libcfs_cpu coding style

From: Dmitry Eremin <[email protected]>

This patch brings the Lustre CPT code into alignment with the
Linux kernel coding style.

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23304
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch to handle recent libcfs changes

.../lustre/include/linux/libcfs/libcfs_cpu.h | 76 ++++++++----------
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 92 ++++++++--------------
2 files changed, 66 insertions(+), 102 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
index d5237d0..2c97adf 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
@@ -144,8 +144,7 @@ struct cfs_cpt_table {
/**
* return total number of CPU partitions in \a cptab
*/
-int
-cfs_cpt_number(struct cfs_cpt_table *cptab);
+int cfs_cpt_number(struct cfs_cpt_table *cptab);
/**
* return number of HW cores or hyper-threadings in a CPU partition \a cpt
*/
@@ -207,25 +206,24 @@ void cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab,
* remove all cpus in NUMA node \a node from CPU partition \a cpt
*/
void cfs_cpt_unset_node(struct cfs_cpt_table *cptab, int cpt, int node);
-
/**
* add all cpus in node mask \a mask to CPU partition \a cpt
* return 1 if successfully set all CPUs, otherwise return 0
*/
int cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab,
- int cpt, nodemask_t *mask);
+ int cpt, const nodemask_t *mask);
/**
* remove all cpus in node mask \a mask from CPU partition \a cpt
*/
void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
- int cpt, nodemask_t *mask);
+ int cpt, const nodemask_t *mask);
/**
* convert partition id \a cpt to numa node id, if there are more than one
* nodes in this partition, it might return a different node id each time.
*/
int cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt);

-int cfs_cpu_init(void);
+int cfs_cpu_init(void);
void cfs_cpu_fini(void);

#else /* !CONFIG_SMP */
@@ -282,32 +280,29 @@ static inline int cfs_cpt_distance_print(struct cfs_cpt_table *cptab,
return rc;
}

-static inline cpumask_var_t *
-cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
+static inline cpumask_var_t *cfs_cpt_cpumask(struct cfs_cpt_table *cptab,
+ int cpt)
{
return &cptab->ctb_cpumask;
}

-static inline int
-cfs_cpt_number(struct cfs_cpt_table *cptab)
+static inline int cfs_cpt_number(struct cfs_cpt_table *cptab)
{
return 1;
}

-static inline int
-cfs_cpt_weight(struct cfs_cpt_table *cptab, int cpt)
+static inline int cfs_cpt_weight(struct cfs_cpt_table *cptab, int cpt)
{
return 1;
}

-static inline int
-cfs_cpt_online(struct cfs_cpt_table *cptab, int cpt)
+static inline int cfs_cpt_online(struct cfs_cpt_table *cptab, int cpt)
{
return 1;
}

-static inline nodemask_t *
-cfs_cpt_nodemask(struct cfs_cpt_table *cptab, int cpt)
+static inline nodemask_t *cfs_cpt_nodemask(struct cfs_cpt_table *cptab,
+ int cpt)
{
return &cptab->ctb_nodemask;
}
@@ -318,66 +313,61 @@ static inline unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab,
return 1;
}

-static inline int
-cfs_cpt_set_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
+static inline int cfs_cpt_set_cpu(struct cfs_cpt_table *cptab, int cpt,
+ int cpu)
{
return 1;
}

-static inline void
-cfs_cpt_unset_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
+static inline void cfs_cpt_unset_cpu(struct cfs_cpt_table *cptab, int cpt,
+ int cpu)
{
}

-static inline int
-cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt,
- const cpumask_t *mask)
+static inline int cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt,
+ const cpumask_t *mask)
{
return 1;
}

-static inline void
-cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt,
- const cpumask_t *mask)
+static inline void cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt,
+ const cpumask_t *mask)
{
}

-static inline int
-cfs_cpt_set_node(struct cfs_cpt_table *cptab, int cpt, int node)
+static inline int cfs_cpt_set_node(struct cfs_cpt_table *cptab, int cpt,
+ int node)
{
return 1;
}

-static inline void
-cfs_cpt_unset_node(struct cfs_cpt_table *cptab, int cpt, int node)
+static inline void cfs_cpt_unset_node(struct cfs_cpt_table *cptab, int cpt,
+ int node)
{
}

-static inline int
-cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab, int cpt, nodemask_t *mask)
+static inline int cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab, int cpt,
+ const nodemask_t *mask)
{
return 1;
}

-static inline void
-cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab, int cpt, nodemask_t *mask)
+static inline void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
+ int cpt, const nodemask_t *mask)
{
}

-static inline int
-cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt)
+static inline int cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt)
{
return 0;
}

-static inline int
-cfs_cpt_current(struct cfs_cpt_table *cptab, int remap)
+static inline int cfs_cpt_current(struct cfs_cpt_table *cptab, int remap)
{
return 0;
}

-static inline int
-cfs_cpt_of_cpu(struct cfs_cpt_table *cptab, int cpu)
+static inline int cfs_cpt_of_cpu(struct cfs_cpt_table *cptab, int cpu)
{
return 0;
}
@@ -387,14 +377,12 @@ static inline int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)
return 0;
}

-static inline int
-cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
+static inline int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
{
return 0;
}

-static inline int
-cfs_cpu_init(void)
+static inline int cfs_cpu_init(void)
{
cfs_cpt_tab = cfs_cpt_table_alloc(1);

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index bf41ba3..fab6675 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -69,8 +69,7 @@
module_param(cpu_pattern, charp, 0444);
MODULE_PARM_DESC(cpu_pattern, "CPU partitions pattern");

-struct cfs_cpt_table *
-cfs_cpt_table_alloc(unsigned int ncpt)
+struct cfs_cpt_table *cfs_cpt_table_alloc(unsigned int ncpt)
{
struct cfs_cpt_table *cptab;
int i;
@@ -161,8 +160,7 @@ struct cfs_cpt_table *
}
EXPORT_SYMBOL(cfs_cpt_table_alloc);

-void
-cfs_cpt_table_free(struct cfs_cpt_table *cptab)
+void cfs_cpt_table_free(struct cfs_cpt_table *cptab)
{
int i;

@@ -186,8 +184,7 @@ struct cfs_cpt_table *
}
EXPORT_SYMBOL(cfs_cpt_table_free);

-int
-cfs_cpt_table_print(struct cfs_cpt_table *cptab, char *buf, int len)
+int cfs_cpt_table_print(struct cfs_cpt_table *cptab, char *buf, int len)
{
char *tmp = buf;
int rc;
@@ -262,15 +259,13 @@ int cfs_cpt_distance_print(struct cfs_cpt_table *cptab, char *buf, int len)
}
EXPORT_SYMBOL(cfs_cpt_distance_print);

-int
-cfs_cpt_number(struct cfs_cpt_table *cptab)
+int cfs_cpt_number(struct cfs_cpt_table *cptab)
{
return cptab->ctb_nparts;
}
EXPORT_SYMBOL(cfs_cpt_number);

-int
-cfs_cpt_weight(struct cfs_cpt_table *cptab, int cpt)
+int cfs_cpt_weight(struct cfs_cpt_table *cptab, int cpt)
{
LASSERT(cpt == CFS_CPT_ANY || (cpt >= 0 && cpt < cptab->ctb_nparts));

@@ -280,8 +275,7 @@ int cfs_cpt_distance_print(struct cfs_cpt_table *cptab, char *buf, int len)
}
EXPORT_SYMBOL(cfs_cpt_weight);

-int
-cfs_cpt_online(struct cfs_cpt_table *cptab, int cpt)
+int cfs_cpt_online(struct cfs_cpt_table *cptab, int cpt)
{
LASSERT(cpt == CFS_CPT_ANY || (cpt >= 0 && cpt < cptab->ctb_nparts));

@@ -293,8 +287,7 @@ int cfs_cpt_distance_print(struct cfs_cpt_table *cptab, char *buf, int len)
}
EXPORT_SYMBOL(cfs_cpt_online);

-cpumask_var_t *
-cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
+cpumask_var_t *cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
{
LASSERT(cpt == CFS_CPT_ANY || (cpt >= 0 && cpt < cptab->ctb_nparts));

@@ -303,8 +296,7 @@ int cfs_cpt_distance_print(struct cfs_cpt_table *cptab, char *buf, int len)
}
EXPORT_SYMBOL(cfs_cpt_cpumask);

-nodemask_t *
-cfs_cpt_nodemask(struct cfs_cpt_table *cptab, int cpt)
+nodemask_t *cfs_cpt_nodemask(struct cfs_cpt_table *cptab, int cpt)
{
LASSERT(cpt == CFS_CPT_ANY || (cpt >= 0 && cpt < cptab->ctb_nparts));

@@ -450,8 +442,7 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
}

-int
-cfs_cpt_set_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
+int cfs_cpt_set_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
{
LASSERT(cpt >= 0 && cpt < cptab->ctb_nparts);

@@ -476,8 +467,7 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_set_cpu);

-void
-cfs_cpt_unset_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
+void cfs_cpt_unset_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
{
LASSERT(cpt == CFS_CPT_ANY || (cpt >= 0 && cpt < cptab->ctb_nparts));

@@ -509,9 +499,8 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_unset_cpu);

-int
-cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt,
- const cpumask_t *mask)
+int cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt,
+ const cpumask_t *mask)
{
int cpu;

@@ -531,9 +520,8 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_set_cpumask);

-void
-cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt,
- const cpumask_t *mask)
+void cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt,
+ const cpumask_t *mask)
{
int cpu;

@@ -542,8 +530,7 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_unset_cpumask);

-int
-cfs_cpt_set_node(struct cfs_cpt_table *cptab, int cpt, int node)
+int cfs_cpt_set_node(struct cfs_cpt_table *cptab, int cpt, int node)
{
const cpumask_t *mask;
int cpu;
@@ -565,8 +552,7 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_set_node);

-void
-cfs_cpt_unset_node(struct cfs_cpt_table *cptab, int cpt, int node)
+void cfs_cpt_unset_node(struct cfs_cpt_table *cptab, int cpt, int node)
{
const cpumask_t *mask;
int cpu;
@@ -586,8 +572,8 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_unset_node);

-int
-cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab, int cpt, nodemask_t *mask)
+int cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab, int cpt,
+ const nodemask_t *mask)
{
int i;

@@ -600,8 +586,8 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_set_nodemask);

-void
-cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab, int cpt, nodemask_t *mask)
+void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab, int cpt,
+ const nodemask_t *mask)
{
int i;

@@ -610,8 +596,7 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_unset_nodemask);

-int
-cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt)
+int cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt)
{
nodemask_t *mask;
int weight;
@@ -643,8 +628,7 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_spread_node);

-int
-cfs_cpt_current(struct cfs_cpt_table *cptab, int remap)
+int cfs_cpt_current(struct cfs_cpt_table *cptab, int remap)
{
int cpu;
int cpt;
@@ -664,8 +648,7 @@ static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
}
EXPORT_SYMBOL(cfs_cpt_current);

-int
-cfs_cpt_of_cpu(struct cfs_cpt_table *cptab, int cpu)
+int cfs_cpt_of_cpu(struct cfs_cpt_table *cptab, int cpu)
{
LASSERT(cpu >= 0 && cpu < nr_cpu_ids);

@@ -682,8 +665,7 @@ int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)
}
EXPORT_SYMBOL(cfs_cpt_of_node);

-int
-cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
+int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
{
cpumask_var_t *cpumask;
nodemask_t *nodemask;
@@ -727,9 +709,8 @@ int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)
* Choose max to \a number CPUs from \a node and set them in \a cpt.
* We always prefer to choose CPU in the same core/socket.
*/
-static int
-cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,
- cpumask_t *node, int number)
+static int cfs_cpt_choose_ncpus(struct cfs_cpt_table *cptab, int cpt,
+ cpumask_t *node, int number)
{
cpumask_var_t socket;
cpumask_var_t core;
@@ -805,8 +786,7 @@ int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)

#define CPT_WEIGHT_MIN 4u

-static unsigned int
-cfs_cpt_num_estimate(void)
+static unsigned int cfs_cpt_num_estimate(void)
{
unsigned int nnode = num_online_nodes();
unsigned int ncpu = num_online_cpus();
@@ -848,8 +828,7 @@ int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)
return ncpt;
}

-static struct cfs_cpt_table *
-cfs_cpt_table_create(int ncpt)
+static struct cfs_cpt_table *cfs_cpt_table_create(int ncpt)
{
struct cfs_cpt_table *cptab = NULL;
cpumask_var_t mask;
@@ -932,9 +911,9 @@ int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)

return cptab;

- failed_mask:
+failed_mask:
free_cpumask_var(mask);
- failed:
+failed:
CERROR("Failed to setup CPU-partition-table with %d CPU-partitions, online HW nodes: %d, HW cpus: %d.\n",
ncpt, num_online_nodes(), num_online_cpus());

@@ -944,8 +923,7 @@ int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)
return NULL;
}

-static struct cfs_cpt_table *
-cfs_cpt_table_create_pattern(char *pattern)
+static struct cfs_cpt_table *cfs_cpt_table_create_pattern(char *pattern)
{
struct cfs_cpt_table *cptab;
char *str;
@@ -1089,7 +1067,7 @@ int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)

return cptab;

- failed:
+failed:
cfs_cpt_table_free(cptab);
return NULL;
}
@@ -1116,8 +1094,7 @@ static int cfs_cpu_dead(unsigned int cpu)
}
#endif

-void
-cfs_cpu_fini(void)
+void cfs_cpu_fini(void)
{
if (cfs_cpt_tab)
cfs_cpt_table_free(cfs_cpt_tab);
@@ -1129,8 +1106,7 @@ static int cfs_cpu_dead(unsigned int cpu)
#endif
}

-int
-cfs_cpu_init(void)
+int cfs_cpu_init(void)
{
int ret;

--
1.8.3.1


2018-05-29 14:27:49

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 12/25] staging: lustre: libcfs: invert error handling for cfs_cpt_table_print

From: Amir Shehata <[email protected]>

Instead of setting rc to -EFBIG in several places in the loop,
let's just jump to the out label on error, which returns -E2BIG
directly.
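
For illustration, a minimal userspace sketch of the resulting pattern
(names and the fake CPU numbering are illustrative, not the kernel
code): each snprintf() consumes from the remaining length, and any
overflow jumps straight to the single error label.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for cfs_cpt_table_print(): write "i\t: cpu\n"
 * rows into buf. On overflow, jump to the error label and return
 * -E2BIG instead of tracking an error code in a loop variable. */
static int print_parts(char *buf, int len, int nparts)
{
	char *tmp = buf;
	int rc;
	int i;

	for (i = 0; i < nparts; i++) {
		if (len <= 0)
			goto out;

		rc = snprintf(tmp, len, "%d\t:", i);
		len -= rc;
		/* checking len before every snprintf() also keeps a
		 * negative length from ever being passed as size_t */
		if (len <= 0)
			goto out;
		tmp += rc;

		rc = snprintf(tmp, len, " %d\n", i * 2); /* fake CPU id */
		len -= rc;
		if (len <= 0)
			goto out;
		tmp += rc;
	}

	return tmp - buf;
out:
	return -E2BIG;
}
```

Note that snprintf() returns the length it *would* have written, so
`len` can go negative after a truncated write; that is exactly what the
`len <= 0` checks catch.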

Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) New patch to replace several patches. Went overboard with the
one change per patch approach.

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 25 ++++++++++---------------
1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 9ff9fe9..bf41ba3 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -190,29 +190,26 @@ struct cfs_cpt_table *
cfs_cpt_table_print(struct cfs_cpt_table *cptab, char *buf, int len)
{
char *tmp = buf;
- int rc = 0;
+ int rc;
int i;
int j;

for (i = 0; i < cptab->ctb_nparts; i++) {
- if (len > 0) {
- rc = snprintf(tmp, len, "%d\t:", i);
- len -= rc;
- }
+ if (len <= 0)
+ goto out;
+
+ rc = snprintf(tmp, len, "%d\t:", i);
+ len -= rc;

- if (len <= 0) {
- rc = -EFBIG;
+ if (len <= 0)
goto out;
- }

tmp += rc;
for_each_cpu(j, cptab->ctb_parts[i].cpt_cpumask) {
rc = snprintf(tmp, len, "%d ", j);
len -= rc;
- if (len <= 0) {
- rc = -EFBIG;
+ if (len <= 0)
goto out;
- }
tmp += rc;
}

@@ -221,11 +218,9 @@ struct cfs_cpt_table *
len--;
}

- out:
- if (rc < 0)
- return rc;
-
return tmp - buf;
+out:
+ return -E2BIG;
}
EXPORT_SYMBOL(cfs_cpt_table_print);

--
1.8.3.1


2018-05-29 14:29:00

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 11/25] staging: lustre: libcfs: provide debugfs files for distance handling

From: Amir Shehata <[email protected]>

On systems with a large number of NUMA nodes and cores it is easy
to misconfigure their use with Lustre. Provide debugfs files that
can help track down any issues.
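
The row format below is an assumption based on
cfs_cpt_distance_print() elsewhere in this series ("cpt\t: cpt:distance
cpt:distance ...\n"); a minimal userspace parser for one such row might
look like:

```c
#include <stdio.h>

/* Parse one assumed cpu_partition_distance row, e.g. "0\t: 0:10 1:20\n".
 * Returns the number of cpt:distance pairs parsed (or -1 on a malformed
 * row), storing the row's CPT id in *cpt and each distance in dist[],
 * indexed by the peer CPT. */
static int parse_distance_row(const char *row, int *cpt,
			      unsigned int *dist, int max)
{
	int n = 0;
	int used;
	int j;
	unsigned int d;

	if (sscanf(row, "%d\t:%n", cpt, &used) != 1)
		return -1;
	row += used;

	while (n < max && sscanf(row, " %d:%u%n", &j, &d, &used) == 2) {
		dist[j] = d;
		row += used;
		n++;
	}
	return n;
}
```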

Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. No code changes from original patch

drivers/staging/lustre/lnet/libcfs/module.c | 53 +++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)

diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
index b438d456..d2dfc29 100644
--- a/drivers/staging/lustre/lnet/libcfs/module.c
+++ b/drivers/staging/lustre/lnet/libcfs/module.c
@@ -468,6 +468,53 @@ static int proc_cpt_table(struct ctl_table *table, int write,
__proc_cpt_table);
}

+static int __proc_cpt_distance(void *data, int write,
+ loff_t pos, void __user *buffer, int nob)
+{
+ char *buf = NULL;
+ int len = 4096;
+ int rc = 0;
+
+ if (write)
+ return -EPERM;
+
+ LASSERT(cfs_cpt_tab);
+
+ while (1) {
+ buf = kzalloc(len, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ rc = cfs_cpt_distance_print(cfs_cpt_tab, buf, len);
+ if (rc >= 0)
+ break;
+
+ if (rc == -EFBIG) {
+ kfree(buf);
+ len <<= 1;
+ continue;
+ }
+ goto out;
+ }
+
+ if (pos >= rc) {
+ rc = 0;
+ goto out;
+ }
+
+ rc = cfs_trace_copyout_string(buffer, nob, buf + pos, NULL);
+out:
+ kfree(buf);
+ return rc;
+}
+
+static int proc_cpt_distance(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ return lprocfs_call_handler(table->data, write, ppos, buffer, lenp,
+ __proc_cpt_distance);
+}
+
static struct ctl_table lnet_table[] = {
{
.procname = "debug",
@@ -497,6 +544,12 @@ static int proc_cpt_table(struct ctl_table *table, int write,
.proc_handler = &proc_cpt_table,
},
{
+ .procname = "cpu_partition_distance",
+ .maxlen = 128,
+ .mode = 0444,
+ .proc_handler = &proc_cpt_distance,
+ },
+ {
.procname = "debug_log_upcall",
.data = lnet_debug_log_upcall,
.maxlen = sizeof(lnet_debug_log_upcall),
--
1.8.3.1


2018-05-29 14:29:11

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 05/25] staging: lustre: libcfs: replace MAX_NUMNODES with nr_node_ids

From: Amir Shehata <[email protected]>

Replace MAX_NUMNODES, which is considered deprecated, with
nr_node_ids. Looking at page_alloc.c you will see that nr_node_ids
is equal to MAX_NUMNODES. MAX_NUMNODES is actually set up with
Kconfig.

Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Same code but added in more details in commit message

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index b67a60c..d3017e8 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -395,7 +395,7 @@ struct cfs_cpt_table *
{
const cpumask_t *mask;

- if (node < 0 || node >= MAX_NUMNODES) {
+ if (node < 0 || node >= nr_node_ids) {
CDEBUG(D_INFO,
"Invalid NUMA id %d for CPU partition %d\n", node, cpt);
return 0;
@@ -412,7 +412,7 @@ struct cfs_cpt_table *
{
const cpumask_t *mask;

- if (node < 0 || node >= MAX_NUMNODES) {
+ if (node < 0 || node >= nr_node_ids) {
CDEBUG(D_INFO,
"Invalid NUMA id %d for CPU partition %d\n", node, cpt);
return;
@@ -836,7 +836,7 @@ struct cfs_cpt_table *
return cptab;
}

- high = node ? MAX_NUMNODES - 1 : nr_cpu_ids - 1;
+ high = node ? nr_node_ids - 1 : nr_cpu_ids - 1;

for (str = strim(pattern), c = 0;; c++) {
struct cfs_range_expr *range;
--
1.8.3.1


2018-05-29 14:29:12

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 10/25] staging: lustre: libcfs: use distance in cpu and node handling

From: Amir Shehata <[email protected]>

Take into consideration the location of NUMA nodes and cores when
calling cfs_cpt_[un]set_cpu() and cfs_cpt_[un]set_node(). This
enables proper functioning on platforms with hundreds of cores and
NUMA nodes.
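
The core of the distance bookkeeping is
cfs_cpt_distance_calculate(), which takes the maximum node_distance()
over two node sets. A userspace sketch with a toy 4-node distance
matrix standing in for the kernel's node_distance() and plain bitmasks
standing in for nodemask_t:

```c
/* Toy SLIT-style matrix: 10 = local node, 20 = one hop away. */
static const unsigned int node_dist[4][4] = {
	{ 10, 20, 20, 20 },
	{ 20, 10, 20, 20 },
	{ 20, 20, 10, 20 },
	{ 20, 20, 20, 10 },
};

/* Mirror of cfs_cpt_distance_calculate(): the maximum distance from
 * any node in from_mask to any node in to_mask. */
static unsigned int max_distance(unsigned int from_mask,
				 unsigned int to_mask)
{
	unsigned int maximum = 0;
	int from;
	int to;

	for (from = 0; from < 4; from++) {
		if (!(from_mask & (1u << from)))
			continue;
		for (to = 0; to < 4; to++) {
			if (!(to_mask & (1u << to)))
				continue;
			if (node_dist[from][to] > maximum)
				maximum = node_dist[from][to];
		}
	}
	return maximum;
}
```

A CPT confined to one node reports the local distance; as soon as a
CPT spans two nodes its self-distance jumps to the inter-node value,
which is what the debugfs output makes visible.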

Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch to handle recent libcfs changes

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 192 ++++++++++++++++++------
1 file changed, 143 insertions(+), 49 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 2a74e51..9ff9fe9 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -330,11 +330,134 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)
}
EXPORT_SYMBOL(cfs_cpt_distance);

+/*
+ * Calculate the maximum NUMA distance between all nodes in the
+ * from_mask and all nodes in the to_mask.
+ */
+static unsigned int cfs_cpt_distance_calculate(nodemask_t *from_mask,
+ nodemask_t *to_mask)
+{
+ unsigned int maximum;
+ unsigned int distance;
+ int from;
+ int to;
+
+ maximum = 0;
+ for_each_node_mask(from, *from_mask) {
+ for_each_node_mask(to, *to_mask) {
+ distance = node_distance(from, to);
+ if (maximum < distance)
+ maximum = distance;
+ }
+ }
+ return maximum;
+}
+
+static void cfs_cpt_add_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
+{
+ cptab->ctb_cpu2cpt[cpu] = cpt;
+
+ cpumask_set_cpu(cpu, cptab->ctb_cpumask);
+ cpumask_set_cpu(cpu, cptab->ctb_parts[cpt].cpt_cpumask);
+}
+
+static void cfs_cpt_del_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
+{
+ cpumask_clear_cpu(cpu, cptab->ctb_parts[cpt].cpt_cpumask);
+ cpumask_clear_cpu(cpu, cptab->ctb_cpumask);
+
+ cptab->ctb_cpu2cpt[cpu] = -1;
+}
+
+static void cfs_cpt_add_node(struct cfs_cpt_table *cptab, int cpt, int node)
+{
+ struct cfs_cpu_partition *part;
+
+ if (!node_isset(node, *cptab->ctb_nodemask)) {
+ unsigned int dist;
+
+ /* first time node is added to the CPT table */
+ node_set(node, *cptab->ctb_nodemask);
+ cptab->ctb_node2cpt[node] = cpt;
+
+ dist = cfs_cpt_distance_calculate(cptab->ctb_nodemask,
+ cptab->ctb_nodemask);
+ cptab->ctb_distance = dist;
+ }
+
+ part = &cptab->ctb_parts[cpt];
+ if (!node_isset(node, *part->cpt_nodemask)) {
+ int cpt2;
+
+ /* first time node is added to this CPT */
+ node_set(node, *part->cpt_nodemask);
+ for (cpt2 = 0; cpt2 < cptab->ctb_nparts; cpt2++) {
+ struct cfs_cpu_partition *part2;
+ unsigned int dist;
+
+ part2 = &cptab->ctb_parts[cpt2];
+ dist = cfs_cpt_distance_calculate(part->cpt_nodemask,
+ part2->cpt_nodemask);
+ part->cpt_distance[cpt2] = dist;
+ dist = cfs_cpt_distance_calculate(part2->cpt_nodemask,
+ part->cpt_nodemask);
+ part2->cpt_distance[cpt] = dist;
+ }
+ }
+}
+
+static void cfs_cpt_del_node(struct cfs_cpt_table *cptab, int cpt, int node)
+{
+ struct cfs_cpu_partition *part = &cptab->ctb_parts[cpt];
+ int cpu;
+
+ for_each_cpu(cpu, part->cpt_cpumask) {
+ /* this CPT has other CPU belonging to this node? */
+ if (cpu_to_node(cpu) == node)
+ break;
+ }
+
+ if (cpu >= nr_cpu_ids && node_isset(node, *part->cpt_nodemask)) {
+ int cpt2;
+
+ /* No more CPUs in the node for this CPT. */
+ node_clear(node, *part->cpt_nodemask);
+ for (cpt2 = 0; cpt2 < cptab->ctb_nparts; cpt2++) {
+ struct cfs_cpu_partition *part2;
+ unsigned int dist;
+
+ part2 = &cptab->ctb_parts[cpt2];
+ if (node_isset(node, *part2->cpt_nodemask))
+ cptab->ctb_node2cpt[node] = cpt2;
+
+ dist = cfs_cpt_distance_calculate(part->cpt_nodemask,
+ part2->cpt_nodemask);
+ part->cpt_distance[cpt2] = dist;
+ dist = cfs_cpt_distance_calculate(part2->cpt_nodemask,
+ part->cpt_nodemask);
+ part2->cpt_distance[cpt] = dist;
+ }
+ }
+
+ for_each_cpu(cpu, cptab->ctb_cpumask) {
+ /* this CPT-table has other CPUs belonging to this node? */
+ if (cpu_to_node(cpu) == node)
+ break;
+ }
+
+ if (cpu >= nr_cpu_ids && node_isset(node, *cptab->ctb_nodemask)) {
+ /* No more CPUs in the table for this node. */
+ node_clear(node, *cptab->ctb_nodemask);
+ cptab->ctb_node2cpt[node] = -1;
+ cptab->ctb_distance =
+ cfs_cpt_distance_calculate(cptab->ctb_nodemask,
+ cptab->ctb_nodemask);
+ }
+}
+
int
cfs_cpt_set_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
{
- int node;
-
LASSERT(cpt >= 0 && cpt < cptab->ctb_nparts);

if (cpu < 0 || cpu >= nr_cpu_ids || !cpu_online(cpu)) {
@@ -348,23 +471,11 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)
return 0;
}

- cptab->ctb_cpu2cpt[cpu] = cpt;
-
LASSERT(!cpumask_test_cpu(cpu, cptab->ctb_cpumask));
LASSERT(!cpumask_test_cpu(cpu, cptab->ctb_parts[cpt].cpt_cpumask));

- cpumask_set_cpu(cpu, cptab->ctb_cpumask);
- cpumask_set_cpu(cpu, cptab->ctb_parts[cpt].cpt_cpumask);
-
- node = cpu_to_node(cpu);
-
- /* first CPU of @node in this CPT table */
- if (!node_isset(node, *cptab->ctb_nodemask))
- node_set(node, *cptab->ctb_nodemask);
-
- /* first CPU of @node in this partition */
- if (!node_isset(node, *cptab->ctb_parts[cpt].cpt_nodemask))
- node_set(node, *cptab->ctb_parts[cpt].cpt_nodemask);
+ cfs_cpt_add_cpu(cptab, cpt, cpu);
+ cfs_cpt_add_node(cptab, cpt, cpu_to_node(cpu));

return 1;
}
@@ -373,9 +484,6 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)
void
cfs_cpt_unset_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
{
- int node;
- int i;
-
LASSERT(cpt == CFS_CPT_ANY || (cpt >= 0 && cpt < cptab->ctb_nparts));

if (cpu < 0 || cpu >= nr_cpu_ids) {
@@ -401,32 +509,8 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)
LASSERT(cpumask_test_cpu(cpu, cptab->ctb_parts[cpt].cpt_cpumask));
LASSERT(cpumask_test_cpu(cpu, cptab->ctb_cpumask));

- cpumask_clear_cpu(cpu, cptab->ctb_parts[cpt].cpt_cpumask);
- cpumask_clear_cpu(cpu, cptab->ctb_cpumask);
- cptab->ctb_cpu2cpt[cpu] = -1;
-
- node = cpu_to_node(cpu);
-
- LASSERT(node_isset(node, *cptab->ctb_parts[cpt].cpt_nodemask));
- LASSERT(node_isset(node, *cptab->ctb_nodemask));
-
- for_each_cpu(i, cptab->ctb_parts[cpt].cpt_cpumask) {
- /* this CPT has other CPU belonging to this node? */
- if (cpu_to_node(i) == node)
- break;
- }
-
- if (i >= nr_cpu_ids)
- node_clear(node, *cptab->ctb_parts[cpt].cpt_nodemask);
-
- for_each_cpu(i, cptab->ctb_cpumask) {
- /* this CPT-table has other CPU belonging to this node? */
- if (cpu_to_node(i) == node)
- break;
- }
-
- if (i >= nr_cpu_ids)
- node_clear(node, *cptab->ctb_nodemask);
+ cfs_cpt_del_cpu(cptab, cpt, cpu);
+ cfs_cpt_del_node(cptab, cpt, cpu_to_node(cpu));
}
EXPORT_SYMBOL(cfs_cpt_unset_cpu);

@@ -444,8 +528,8 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)
}

for_each_cpu(cpu, mask) {
- if (!cfs_cpt_set_cpu(cptab, cpt, cpu))
- return 0;
+ cfs_cpt_add_cpu(cptab, cpt, cpu);
+ cfs_cpt_add_node(cptab, cpt, cpu_to_node(cpu));
}

return 1;
@@ -467,6 +551,7 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)
cfs_cpt_set_node(struct cfs_cpt_table *cptab, int cpt, int node)
{
const cpumask_t *mask;
+ int cpu;

if (node < 0 || node >= nr_node_ids) {
CDEBUG(D_INFO,
@@ -476,7 +561,12 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)

mask = cpumask_of_node(node);

- return cfs_cpt_set_cpumask(cptab, cpt, mask);
+ for_each_cpu(cpu, mask)
+ cfs_cpt_add_cpu(cptab, cpt, cpu);
+
+ cfs_cpt_add_node(cptab, cpt, node);
+
+ return 1;
}
EXPORT_SYMBOL(cfs_cpt_set_node);

@@ -484,6 +574,7 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)
cfs_cpt_unset_node(struct cfs_cpt_table *cptab, int cpt, int node)
{
const cpumask_t *mask;
+ int cpu;

if (node < 0 || node >= nr_node_ids) {
CDEBUG(D_INFO,
@@ -493,7 +584,10 @@ unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)

mask = cpumask_of_node(node);

- cfs_cpt_unset_cpumask(cptab, cpt, mask);
+ for_each_cpu(cpu, mask)
+ cfs_cpt_del_cpu(cptab, cpt, cpu);
+
+ cfs_cpt_del_node(cptab, cpt, node);
}
EXPORT_SYMBOL(cfs_cpt_unset_node);

--
1.8.3.1


2018-05-29 14:29:21

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 08/25] staging: lustre: libcfs: NUMA support

From: Amir Shehata <[email protected]>

This patch adds NUMA node support. NUMA node information is stored
in the CPT table. A NUMA node mask is maintained for the entire
table as well as for each CPT to track the NUMA nodes related to
each of the CPTs. Add a new function, cfs_cpt_of_node(), which
returns the CPT of a particular NUMA node.
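
A userspace sketch of the ctb_node2cpt shadow table this patch adds
(names and sizes are illustrative, not the kernel structures): the
table holds nr_node_ids entries initialized to -1 ("node not in any
CPT") via memset, which works because -1 is all-one bits in every
byte of a two's-complement int.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate the shadow node -> CPT table, every entry set to -1. */
static int *node2cpt_alloc(int nr_nodes)
{
	int *tab = malloc(nr_nodes * sizeof(*tab));

	if (tab)
		memset(tab, -1, nr_nodes * sizeof(*tab));
	return tab;
}

/* Sketch of a cfs_cpt_of_node()-style lookup; note the >= bound so
 * that node == nr_nodes is rejected as out of range. */
static int cpt_of_node(const int *tab, int nr_nodes, int node)
{
	if (node < 0 || node >= nr_nodes)
		return -1;	/* stand-in for CFS_CPT_ANY */
	return tab[node];
}
```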

Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch to handle recent libcfs changes

.../lustre/include/linux/libcfs/libcfs_cpu.h | 11 +++++++++++
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 21 +++++++++++++++++++++
2 files changed, 32 insertions(+)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
index 3626969..487625d 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
@@ -103,6 +103,8 @@ struct cfs_cpt_table {
struct cfs_cpu_partition *ctb_parts;
/* shadow HW CPU to CPU partition ID */
int *ctb_cpu2cpt;
+ /* shadow HW node to CPU partition ID */
+ int *ctb_node2cpt;
/* all nodes in this partition table */
nodemask_t *ctb_nodemask;
#else
@@ -157,6 +159,10 @@ struct cfs_cpt_table {
*/
int cfs_cpt_of_cpu(struct cfs_cpt_table *cptab, int cpu);
/**
+ * shadow HW node ID \a NODE to CPU-partition ID by \a cptab
+ */
+int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node);
+/**
* bind current thread on a CPU-partition \a cpt of \a cptab
*/
int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt);
@@ -345,6 +351,11 @@ static inline int cfs_cpt_table_print(struct cfs_cpt_table *cptab,
return 0;
}

+static inline int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)
+{
+ return 0;
+}
+
static inline int
cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
{
diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 3f855a8..f616073 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -98,6 +98,15 @@ struct cfs_cpt_table *
memset(cptab->ctb_cpu2cpt, -1,
nr_cpu_ids * sizeof(cptab->ctb_cpu2cpt[0]));

+ cptab->ctb_node2cpt = kvmalloc_array(nr_node_ids,
+ sizeof(cptab->ctb_node2cpt[0]),
+ GFP_KERNEL);
+ if (!cptab->ctb_node2cpt)
+ goto failed_alloc_node2cpt;
+
+ memset(cptab->ctb_node2cpt, -1,
+ nr_node_ids * sizeof(cptab->ctb_node2cpt[0]));
+
cptab->ctb_parts = kvmalloc_array(ncpt, sizeof(cptab->ctb_parts[0]),
GFP_KERNEL);
if (!cptab->ctb_parts)
@@ -129,6 +138,8 @@ struct cfs_cpt_table *

kvfree(cptab->ctb_parts);
failed_alloc_ctb_parts:
+ kvfree(cptab->ctb_node2cpt);
+failed_alloc_node2cpt:
kvfree(cptab->ctb_cpu2cpt);
failed_alloc_cpu2cpt:
kfree(cptab->ctb_nodemask);
@@ -146,6 +157,7 @@ struct cfs_cpt_table *
int i;

kvfree(cptab->ctb_cpu2cpt);
+ kvfree(cptab->ctb_node2cpt);

for (i = 0; cptab->ctb_parts && i < cptab->ctb_nparts; i++) {
struct cfs_cpu_partition *part = &cptab->ctb_parts[i];
@@ -511,6 +523,15 @@ struct cfs_cpt_table *
}
EXPORT_SYMBOL(cfs_cpt_of_cpu);

+int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node)
+{
+ if (node < 0 || node > nr_node_ids)
+ return CFS_CPT_ANY;
+
+ return cptab->ctb_node2cpt[node];
+}
+EXPORT_SYMBOL(cfs_cpt_of_node);
+
int
cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
{
--
1.8.3.1


2018-05-29 14:29:22

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 09/25] staging: lustre: libcfs: add cpu distance handling

From: Amir Shehata <[email protected]>

Add functionality to calculate the distance between two CPTs.
Expose those distances in debugfs so people deploying a setup
can debug what is being created for the CPTs.
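
The lookup semantics of the new cfs_cpt_distance() can be sketched in
userspace as follows (a 2-CPT stand-in, not the kernel code): passing
CFS_CPT_ANY for either partition falls back to the table-wide maximum
distance, otherwise the per-partition matrix is indexed directly.

```c
#define ANY_CPT (-1)	/* stand-in for CFS_CPT_ANY */

static const unsigned int tbl_distance = 20;	/* table-wide maximum */
static const unsigned int part_distance[2][2] = {
	{ 10, 20 },
	{ 20, 10 },
};

/* Mirror of cfs_cpt_distance() for a fixed 2-CPT table. */
static unsigned int cpt_distance(int cpt1, int cpt2)
{
	if (cpt1 == ANY_CPT || cpt2 == ANY_CPT)
		return tbl_distance;
	return part_distance[cpt1][cpt2];
}
```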

Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch to handle recent libcfs changes

.../lustre/include/linux/libcfs/libcfs_cpu.h | 31 +++++++++++
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 61 ++++++++++++++++++++++
2 files changed, 92 insertions(+)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
index 487625d..d5237d0 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
@@ -87,6 +87,8 @@ struct cfs_cpu_partition {
cpumask_var_t cpt_cpumask;
/* nodes mask for this partition */
nodemask_t *cpt_nodemask;
+ /* NUMA distance between CPTs */
+ unsigned int *cpt_distance;
/* spread rotor for NUMA allocator */
unsigned int cpt_spread_rotor;
};
@@ -97,6 +99,8 @@ struct cfs_cpt_table {
#ifdef CONFIG_SMP
/* spread rotor for NUMA allocator */
unsigned int ctb_spread_rotor;
+ /* maximum NUMA distance between all nodes in table */
+ unsigned int ctb_distance;
/* # of CPU partitions */
unsigned int ctb_nparts;
/* partitions tables */
@@ -134,6 +138,10 @@ struct cfs_cpt_table {
*/
int cfs_cpt_table_print(struct cfs_cpt_table *cptab, char *buf, int len);
/**
+ * print distance information of cpt-table
+ */
+int cfs_cpt_distance_print(struct cfs_cpt_table *cptab, char *buf, int len);
+/**
* return total number of CPU partitions in \a cptab
*/
int
@@ -163,6 +171,10 @@ struct cfs_cpt_table {
*/
int cfs_cpt_of_node(struct cfs_cpt_table *cptab, int node);
/**
+ * NUMA distance between \a cpt1 and \a cpt2 in \a cptab
+ */
+unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2);
+/**
* bind current thread on a CPU-partition \a cpt of \a cptab
*/
int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt);
@@ -257,6 +269,19 @@ static inline int cfs_cpt_table_print(struct cfs_cpt_table *cptab,
return rc;
}

+static inline int cfs_cpt_distance_print(struct cfs_cpt_table *cptab,
+ char *buf, int len)
+{
+ int rc;
+
+ rc = snprintf(buf, len, "0\t: 0:1\n");
+ len -= rc;
+ if (len <= 0)
+ return -EFBIG;
+
+ return rc;
+}
+
static inline cpumask_var_t *
cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
{
@@ -287,6 +312,12 @@ static inline int cfs_cpt_table_print(struct cfs_cpt_table *cptab,
return &cptab->ctb_nodemask;
}

+static inline unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab,
+ int cpt1, int cpt2)
+{
+ return 1;
+}
+
static inline int
cfs_cpt_set_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
{
diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index f616073..2a74e51 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -124,6 +124,15 @@ struct cfs_cpt_table *
GFP_NOFS);
if (!part->cpt_nodemask)
goto failed_setting_ctb_parts;
+
+ part->cpt_distance = kvmalloc_array(cptab->ctb_nparts,
+ sizeof(part->cpt_distance[0]),
+ GFP_KERNEL);
+ if (!part->cpt_distance)
+ goto failed_setting_ctb_parts;
+
+ memset(part->cpt_distance, -1,
+ cptab->ctb_nparts * sizeof(part->cpt_distance[0]));
}

return cptab;
@@ -134,6 +143,7 @@ struct cfs_cpt_table *

kfree(part->cpt_nodemask);
free_cpumask_var(part->cpt_cpumask);
+ kvfree(part->cpt_distance);
}

kvfree(cptab->ctb_parts);
@@ -164,6 +174,7 @@ struct cfs_cpt_table *

kfree(part->cpt_nodemask);
free_cpumask_var(part->cpt_cpumask);
+ kvfree(part->cpt_distance);
}

kvfree(cptab->ctb_parts);
@@ -218,6 +229,44 @@ struct cfs_cpt_table *
}
EXPORT_SYMBOL(cfs_cpt_table_print);

+int cfs_cpt_distance_print(struct cfs_cpt_table *cptab, char *buf, int len)
+{
+ char *tmp = buf;
+ int rc;
+ int i;
+ int j;
+
+ for (i = 0; i < cptab->ctb_nparts; i++) {
+ if (len <= 0)
+ goto err;
+
+ rc = snprintf(tmp, len, "%d\t:", i);
+ len -= rc;
+
+ if (len <= 0)
+ goto err;
+
+ tmp += rc;
+ for (j = 0; j < cptab->ctb_nparts; j++) {
+ rc = snprintf(tmp, len, " %d:%d", j,
+ cptab->ctb_parts[i].cpt_distance[j]);
+ len -= rc;
+ if (len <= 0)
+ goto err;
+ tmp += rc;
+ }
+
+ *tmp = '\n';
+ tmp++;
+ len--;
+ }
+
+ return tmp - buf;
+err:
+ return -E2BIG;
+}
+EXPORT_SYMBOL(cfs_cpt_distance_print);
+
int
cfs_cpt_number(struct cfs_cpt_table *cptab)
{
@@ -269,6 +318,18 @@ struct cfs_cpt_table *
}
EXPORT_SYMBOL(cfs_cpt_nodemask);

+unsigned int cfs_cpt_distance(struct cfs_cpt_table *cptab, int cpt1, int cpt2)
+{
+ LASSERT(cpt1 == CFS_CPT_ANY || (cpt1 >= 0 && cpt1 < cptab->ctb_nparts));
+ LASSERT(cpt2 == CFS_CPT_ANY || (cpt2 >= 0 && cpt2 < cptab->ctb_nparts));
+
+ if (cpt1 == CFS_CPT_ANY || cpt2 == CFS_CPT_ANY)
+ return cptab->ctb_distance;
+
+ return cptab->ctb_parts[cpt1].cpt_distance[cpt2];
+}
+EXPORT_SYMBOL(cfs_cpt_distance);
+
int
cfs_cpt_set_cpu(struct cfs_cpt_table *cptab, int cpt, int cpu)
{
--
1.8.3.1


2018-05-29 14:30:43

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 07/25] staging: lustre: libcfs: replace num_possible_cpus() with nr_cpu_ids

From: Amir Shehata <[email protected]>

Move from num_possible_cpus() to nr_cpu_ids.

Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased patch. Same code

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index d9d1388..3f855a8 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -89,14 +89,14 @@ struct cfs_cpt_table *
if (!cptab->ctb_nodemask)
goto failed_alloc_nodemask;

- cptab->ctb_cpu2cpt = kvmalloc_array(num_possible_cpus(),
+ cptab->ctb_cpu2cpt = kvmalloc_array(nr_cpu_ids,
sizeof(cptab->ctb_cpu2cpt[0]),
GFP_KERNEL);
if (!cptab->ctb_cpu2cpt)
goto failed_alloc_cpu2cpt;

memset(cptab->ctb_cpu2cpt, -1,
- num_possible_cpus() * sizeof(cptab->ctb_cpu2cpt[0]));
+ nr_cpu_ids * sizeof(cptab->ctb_cpu2cpt[0]));

cptab->ctb_parts = kvmalloc_array(ncpt, sizeof(cptab->ctb_parts[0]),
GFP_KERNEL);
--
1.8.3.1


2018-05-29 14:30:55

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 04/25] staging: lustre: libcfs: properly handle failure cases in SMP code

While pushing the SMP work some bugs in the code were pointed out
by Dan Carpenter. Due to the single err label in cfs_cpu_init() and
cfs_cpt_table_alloc(), a few items that were never initialized were
being cleaned up. This can lead to crashes and other problems. In
those initialization functions introduce individual labels to jump
to, so that only the things that were initialized get freed on
failure.
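
A self-contained sketch of the unwind-ladder style this patch moves
to (the struct, names, and fail_step injection are illustrative): each
label frees only what was successfully allocated before the failing
step, instead of one catch-all label touching possibly-uninitialized
members.

```c
#include <stdlib.h>
#include <string.h>

struct tab {
	int *cpu2cpt;
	int *node2cpt;
};

/* fail_step lets a test inject a failure before a given allocation;
 * pass a negative value for the normal success path. */
static struct tab *tab_alloc(int ncpu, int nnode, int fail_step)
{
	struct tab *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;

	if (fail_step == 1)
		goto failed_alloc_cpu2cpt;
	t->cpu2cpt = malloc(ncpu * sizeof(int));
	if (!t->cpu2cpt)
		goto failed_alloc_cpu2cpt;
	memset(t->cpu2cpt, -1, ncpu * sizeof(int));

	if (fail_step == 2)
		goto failed_alloc_node2cpt;
	t->node2cpt = malloc(nnode * sizeof(int));
	if (!t->node2cpt)
		goto failed_alloc_node2cpt;
	memset(t->node2cpt, -1, nnode * sizeof(int));

	return t;

failed_alloc_node2cpt:
	free(t->cpu2cpt);	/* only what step 1 allocated */
failed_alloc_cpu2cpt:
	free(t);
	return NULL;
}
```

The labels fall through in reverse allocation order, so jumping to any
of them frees exactly the prefix of resources acquired so far.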

Signed-off-by: James Simmons <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10932
Reviewed-on: https://review.whamcloud.com/32085
Reviewed-by: Dmitry Eremin <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) New patch to make libcfs SMP code handle failure paths correctly.

drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 72 ++++++++++++++++++-------
1 file changed, 52 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 34df7ed..b67a60c 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -81,17 +81,19 @@ struct cfs_cpt_table *

cptab->ctb_nparts = ncpt;

+ if (!zalloc_cpumask_var(&cptab->ctb_cpumask, GFP_NOFS))
+ goto failed_alloc_cpumask;
+
cptab->ctb_nodemask = kzalloc(sizeof(*cptab->ctb_nodemask),
GFP_NOFS);
- if (!zalloc_cpumask_var(&cptab->ctb_cpumask, GFP_NOFS) ||
- !cptab->ctb_nodemask)
- goto failed;
+ if (!cptab->ctb_nodemask)
+ goto failed_alloc_nodemask;

cptab->ctb_cpu2cpt = kvmalloc_array(num_possible_cpus(),
sizeof(cptab->ctb_cpu2cpt[0]),
GFP_KERNEL);
if (!cptab->ctb_cpu2cpt)
- goto failed;
+ goto failed_alloc_cpu2cpt;

memset(cptab->ctb_cpu2cpt, -1,
num_possible_cpus() * sizeof(cptab->ctb_cpu2cpt[0]));
@@ -99,22 +101,41 @@ struct cfs_cpt_table *
cptab->ctb_parts = kvmalloc_array(ncpt, sizeof(cptab->ctb_parts[0]),
GFP_KERNEL);
if (!cptab->ctb_parts)
- goto failed;
+ goto failed_alloc_ctb_parts;
+
+ memset(cptab->ctb_parts, -1, ncpt * sizeof(cptab->ctb_parts[0]));

for (i = 0; i < ncpt; i++) {
struct cfs_cpu_partition *part = &cptab->ctb_parts[i];

+ if (!zalloc_cpumask_var(&part->cpt_cpumask, GFP_NOFS))
+ goto failed_setting_ctb_parts;
+
part->cpt_nodemask = kzalloc(sizeof(*part->cpt_nodemask),
GFP_NOFS);
- if (!zalloc_cpumask_var(&part->cpt_cpumask, GFP_NOFS) ||
- !part->cpt_nodemask)
- goto failed;
+ if (!part->cpt_nodemask)
+ goto failed_setting_ctb_parts;
}

return cptab;

- failed:
- cfs_cpt_table_free(cptab);
+failed_setting_ctb_parts:
+ while (i-- >= 0) {
+ struct cfs_cpu_partition *part = &cptab->ctb_parts[i];
+
+ kfree(part->cpt_nodemask);
+ free_cpumask_var(part->cpt_cpumask);
+ }
+
+ kvfree(cptab->ctb_parts);
+failed_alloc_ctb_parts:
+ kvfree(cptab->ctb_cpu2cpt);
+failed_alloc_cpu2cpt:
+ kfree(cptab->ctb_nodemask);
+failed_alloc_nodemask:
+ free_cpumask_var(cptab->ctb_cpumask);
+failed_alloc_cpumask:
+ kfree(cptab);
return NULL;
}
EXPORT_SYMBOL(cfs_cpt_table_alloc);
@@ -940,7 +961,7 @@ static int cfs_cpu_dead(unsigned int cpu)
int
cfs_cpu_init(void)
{
- int ret = 0;
+ int ret;

LASSERT(!cfs_cpt_tab);

@@ -949,23 +970,23 @@ static int cfs_cpu_dead(unsigned int cpu)
"staging/lustre/cfe:dead", NULL,
cfs_cpu_dead);
if (ret < 0)
- goto failed;
+ goto failed_cpu_dead;
+
ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
"staging/lustre/cfe:online",
cfs_cpu_online, NULL);
if (ret < 0)
- goto failed;
+ goto failed_cpu_online;
+
lustre_cpu_online = ret;
#endif
- ret = -EINVAL;
-
get_online_cpus();
if (*cpu_pattern) {
char *cpu_pattern_dup = kstrdup(cpu_pattern, GFP_KERNEL);

if (!cpu_pattern_dup) {
CERROR("Failed to duplicate cpu_pattern\n");
- goto failed;
+ goto failed_alloc_table;
}

cfs_cpt_tab = cfs_cpt_table_create_pattern(cpu_pattern_dup);
@@ -973,7 +994,7 @@ static int cfs_cpu_dead(unsigned int cpu)
if (!cfs_cpt_tab) {
CERROR("Failed to create cptab from pattern %s\n",
cpu_pattern);
- goto failed;
+ goto failed_alloc_table;
}

} else {
@@ -981,7 +1002,7 @@ static int cfs_cpu_dead(unsigned int cpu)
if (!cfs_cpt_tab) {
CERROR("Failed to create ptable with npartitions %d\n",
cpu_npartitions);
- goto failed;
+ goto failed_alloc_table;
}
}

@@ -992,8 +1013,19 @@ static int cfs_cpu_dead(unsigned int cpu)
cfs_cpt_number(cfs_cpt_tab));
return 0;

- failed:
+failed_alloc_table:
put_online_cpus();
- cfs_cpu_fini();
+
+ if (cfs_cpt_tab)
+ cfs_cpt_table_free(cfs_cpt_tab);
+
+ ret = -EINVAL;
+#ifdef CONFIG_HOTPLUG_CPU
+ if (lustre_cpu_online > 0)
+ cpuhp_remove_state_nocalls(lustre_cpu_online);
+failed_cpu_online:
+ cpuhp_remove_state_nocalls(CPUHP_LUSTRE_CFS_DEAD);
+failed_cpu_dead:
+#endif
return ret;
}
--
1.8.3.1


2018-05-29 14:30:58

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 02/25] staging: lustre: libcfs: remove useless CPU partition code

From: Dmitry Eremin <[email protected]>

* remove the scratch buffer and the mutex which guards it.
* remove the global cpumask and the spinlock which guards it.
* remove cpt_version, which was used to detect CPU state changes
  during setup; instead simply disable CPU state changes during setup.
* remove the whole global struct cfs_cpt_data cpt_data.
* remove a few unused APIs.
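The key enabler for the first two bullets, visible in the diff below: cpumask_of_node() returns a const pointer into existing topology data, so callers can pass it straight through instead of first copying it into a mutex-guarded scratch buffer. A minimal userspace sketch of the before/after pattern (all names invented for illustration; this is not the libcfs code itself):

```c
#include <assert.h>
#include <string.h>

typedef unsigned long mask_t;

/* stand-in for cpumask_of_node(): a const view into topology data */
static const mask_t node_masks[2] = { 0x0f, 0xf0 };
static const mask_t *mask_of_node(int node) { return &node_masks[node]; }

/* stand-in for cfs_cpt_set_cpumask(): returns 1 on success */
static int set_cpumask(const mask_t *mask) { return *mask != 0; }

/* old pattern: copy into a shared scratch buffer under a lock */
static int scratch_locked;
static mask_t scratch;

static int set_node_old(int node)
{
	int rc;

	scratch_locked = 1;		/* stand-in for mutex_lock() */
	memcpy(&scratch, mask_of_node(node), sizeof(scratch));
	rc = set_cpumask(&scratch);
	scratch_locked = 0;		/* stand-in for mutex_unlock() */
	return rc;
}

/* new pattern: use the const pointer directly, no lock, no copy */
static int set_node_new(int node)
{
	return set_cpumask(mask_of_node(node));
}
```

Both forms return the same result; the new one carries no global state at all, which is what allows the whole struct cfs_cpt_data to be deleted.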

Signed-off-by: Dmitry Eremin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23303
Reviewed-on: https://review.whamcloud.com/25048
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) Initial patch
v2) Rebased to handle recent cleanups in libcfs

.../lustre/include/linux/libcfs/libcfs_cpu.h | 32 ++----
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 115 +++------------------
2 files changed, 22 insertions(+), 125 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
index 2ad12a6..3626969 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
@@ -95,8 +95,6 @@ struct cfs_cpu_partition {
/** descriptor for CPU partitions */
struct cfs_cpt_table {
#ifdef CONFIG_SMP
- /* version, reserved for hotplug */
- unsigned int ctb_version;
/* spread rotor for NUMA allocator */
unsigned int ctb_spread_rotor;
/* # of CPU partitions */
@@ -176,12 +174,12 @@ struct cfs_cpt_table {
* return 1 if successfully set all CPUs, otherwise return 0
*/
int cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab,
- int cpt, cpumask_t *mask);
+ int cpt, const cpumask_t *mask);
/**
* remove all cpus in \a mask from CPU partition \a cpt
*/
void cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab,
- int cpt, cpumask_t *mask);
+ int cpt, const cpumask_t *mask);
/**
* add all cpus in NUMA node \a node to CPU partition \a cpt
* return 1 if successfully set all CPUs, otherwise return 0
@@ -204,20 +202,11 @@ int cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab,
void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
int cpt, nodemask_t *mask);
/**
- * unset all cpus for CPU partition \a cpt
- */
-void cfs_cpt_clear(struct cfs_cpt_table *cptab, int cpt);
-/**
* convert partition id \a cpt to numa node id, if there are more than one
* nodes in this partition, it might return a different node id each time.
*/
int cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt);

-/**
- * return number of HTs in the same core of \a cpu
- */
-int cfs_cpu_ht_nsiblings(int cpu);
-
int cfs_cpu_init(void);
void cfs_cpu_fini(void);

@@ -304,13 +293,15 @@ static inline int cfs_cpt_table_print(struct cfs_cpt_table *cptab,
}

static inline int
-cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt, cpumask_t *mask)
+cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt,
+ const cpumask_t *mask)
{
return 1;
}

static inline void
-cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt, cpumask_t *mask)
+cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt,
+ const cpumask_t *mask)
{
}

@@ -336,11 +327,6 @@ static inline int cfs_cpt_table_print(struct cfs_cpt_table *cptab,
{
}

-static inline void
-cfs_cpt_clear(struct cfs_cpt_table *cptab, int cpt)
-{
-}
-
static inline int
cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt)
{
@@ -348,12 +334,6 @@ static inline int cfs_cpt_table_print(struct cfs_cpt_table *cptab,
}

static inline int
-cfs_cpu_ht_nsiblings(int cpu)
-{
- return 1;
-}
-
-static inline int
cfs_cpt_current(struct cfs_cpt_table *cptab, int remap)
{
return 0;
diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 803fc58..951a9ca 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -69,19 +69,6 @@
module_param(cpu_pattern, charp, 0444);
MODULE_PARM_DESC(cpu_pattern, "CPU partitions pattern");

-static struct cfs_cpt_data {
- /* serialize hotplug etc */
- spinlock_t cpt_lock;
- /* reserved for hotplug */
- unsigned long cpt_version;
- /* mutex to protect cpt_cpumask */
- struct mutex cpt_mutex;
- /* scratch buffer for set/unset_node */
- cpumask_var_t cpt_cpumask;
-} cpt_data;
-
-#define CFS_CPU_VERSION_MAGIC 0xbabecafe
-
struct cfs_cpt_table *
cfs_cpt_table_alloc(unsigned int ncpt)
{
@@ -124,11 +111,6 @@ struct cfs_cpt_table *
goto failed;
}

- spin_lock(&cpt_data.cpt_lock);
- /* Reserved for hotplug */
- cptab->ctb_version = cpt_data.cpt_version;
- spin_unlock(&cpt_data.cpt_lock);
-
return cptab;

failed:
@@ -203,17 +185,6 @@ struct cfs_cpt_table *
}
EXPORT_SYMBOL(cfs_cpt_table_print);

-static void
-cfs_node_to_cpumask(int node, cpumask_t *mask)
-{
- const cpumask_t *tmp = cpumask_of_node(node);
-
- if (tmp)
- cpumask_copy(mask, tmp);
- else
- cpumask_clear(mask);
-}
-
int
cfs_cpt_number(struct cfs_cpt_table *cptab)
{
@@ -366,7 +337,8 @@ struct cfs_cpt_table *
EXPORT_SYMBOL(cfs_cpt_unset_cpu);

int
-cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt, cpumask_t *mask)
+cfs_cpt_set_cpumask(struct cfs_cpt_table *cptab, int cpt,
+ const cpumask_t *mask)
{
int i;

@@ -387,7 +359,8 @@ struct cfs_cpt_table *
EXPORT_SYMBOL(cfs_cpt_set_cpumask);

void
-cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt, cpumask_t *mask)
+cfs_cpt_unset_cpumask(struct cfs_cpt_table *cptab, int cpt,
+ const cpumask_t *mask)
{
int i;

@@ -399,7 +372,7 @@ struct cfs_cpt_table *
int
cfs_cpt_set_node(struct cfs_cpt_table *cptab, int cpt, int node)
{
- int rc;
+ const cpumask_t *mask;

if (node < 0 || node >= MAX_NUMNODES) {
CDEBUG(D_INFO,
@@ -407,34 +380,26 @@ struct cfs_cpt_table *
return 0;
}

- mutex_lock(&cpt_data.cpt_mutex);
-
- cfs_node_to_cpumask(node, cpt_data.cpt_cpumask);
-
- rc = cfs_cpt_set_cpumask(cptab, cpt, cpt_data.cpt_cpumask);
+ mask = cpumask_of_node(node);

- mutex_unlock(&cpt_data.cpt_mutex);
-
- return rc;
+ return cfs_cpt_set_cpumask(cptab, cpt, mask);
}
EXPORT_SYMBOL(cfs_cpt_set_node);

void
cfs_cpt_unset_node(struct cfs_cpt_table *cptab, int cpt, int node)
{
+ const cpumask_t *mask;
+
if (node < 0 || node >= MAX_NUMNODES) {
CDEBUG(D_INFO,
"Invalid NUMA id %d for CPU partition %d\n", node, cpt);
return;
}

- mutex_lock(&cpt_data.cpt_mutex);
-
- cfs_node_to_cpumask(node, cpt_data.cpt_cpumask);
-
- cfs_cpt_unset_cpumask(cptab, cpt, cpt_data.cpt_cpumask);
+ mask = cpumask_of_node(node);

- mutex_unlock(&cpt_data.cpt_mutex);
+ cfs_cpt_unset_cpumask(cptab, cpt, mask);
}
EXPORT_SYMBOL(cfs_cpt_unset_node);

@@ -462,26 +427,6 @@ struct cfs_cpt_table *
}
EXPORT_SYMBOL(cfs_cpt_unset_nodemask);

-void
-cfs_cpt_clear(struct cfs_cpt_table *cptab, int cpt)
-{
- int last;
- int i;
-
- if (cpt == CFS_CPT_ANY) {
- last = cptab->ctb_nparts - 1;
- cpt = 0;
- } else {
- last = cpt;
- }
-
- for (; cpt <= last; cpt++) {
- for_each_cpu(i, cptab->ctb_parts[cpt].cpt_cpumask)
- cfs_cpt_unset_cpu(cptab, cpt, i);
- }
-}
-EXPORT_SYMBOL(cfs_cpt_clear);
-
int
cfs_cpt_spread_node(struct cfs_cpt_table *cptab, int cpt)
{
@@ -754,7 +699,7 @@ struct cfs_cpt_table *
}

for_each_online_node(i) {
- cfs_node_to_cpumask(i, mask);
+ cpumask_copy(mask, cpumask_of_node(i));

while (!cpumask_empty(mask)) {
struct cfs_cpu_partition *part;
@@ -960,16 +905,8 @@ struct cfs_cpt_table *
#ifdef CONFIG_HOTPLUG_CPU
static enum cpuhp_state lustre_cpu_online;

-static void cfs_cpu_incr_cpt_version(void)
-{
- spin_lock(&cpt_data.cpt_lock);
- cpt_data.cpt_version++;
- spin_unlock(&cpt_data.cpt_lock);
-}
-
static int cfs_cpu_online(unsigned int cpu)
{
- cfs_cpu_incr_cpt_version();
return 0;
}

@@ -977,14 +914,9 @@ static int cfs_cpu_dead(unsigned int cpu)
{
bool warn;

- cfs_cpu_incr_cpt_version();
-
- mutex_lock(&cpt_data.cpt_mutex);
/* if all HTs in a core are offline, it may break affinity */
- cpumask_copy(cpt_data.cpt_cpumask, topology_sibling_cpumask(cpu));
- warn = cpumask_any_and(cpt_data.cpt_cpumask,
+ warn = cpumask_any_and(topology_sibling_cpumask(cpu),
cpu_online_mask) >= nr_cpu_ids;
- mutex_unlock(&cpt_data.cpt_mutex);
CDEBUG(warn ? D_WARNING : D_INFO,
"Lustre: can't support CPU plug-out well now, performance and stability could be impacted [CPU %u]\n",
cpu);
@@ -1003,7 +935,6 @@ static int cfs_cpu_dead(unsigned int cpu)
cpuhp_remove_state_nocalls(lustre_cpu_online);
cpuhp_remove_state_nocalls(CPUHP_LUSTRE_CFS_DEAD);
#endif
- free_cpumask_var(cpt_data.cpt_cpumask);
}

int
@@ -1013,16 +944,6 @@ static int cfs_cpu_dead(unsigned int cpu)

LASSERT(!cfs_cpt_tab);

- memset(&cpt_data, 0, sizeof(cpt_data));
-
- if (!zalloc_cpumask_var(&cpt_data.cpt_cpumask, GFP_NOFS)) {
- CERROR("Failed to allocate scratch buffer\n");
- return -1;
- }
-
- spin_lock_init(&cpt_data.cpt_lock);
- mutex_init(&cpt_data.cpt_mutex);
-
#ifdef CONFIG_HOTPLUG_CPU
ret = cpuhp_setup_state_nocalls(CPUHP_LUSTRE_CFS_DEAD,
"staging/lustre/cfe:dead", NULL,
@@ -1038,6 +959,7 @@ static int cfs_cpu_dead(unsigned int cpu)
#endif
ret = -EINVAL;

+ get_online_cpus();
if (*cpu_pattern) {
char *cpu_pattern_dup = kstrdup(cpu_pattern, GFP_KERNEL);

@@ -1063,13 +985,7 @@ static int cfs_cpu_dead(unsigned int cpu)
}
}

- spin_lock(&cpt_data.cpt_lock);
- if (cfs_cpt_tab->ctb_version != cpt_data.cpt_version) {
- spin_unlock(&cpt_data.cpt_lock);
- CERROR("CPU hotplug/unplug during setup\n");
- goto failed;
- }
- spin_unlock(&cpt_data.cpt_lock);
+ put_online_cpus();

LCONSOLE(0, "HW nodes: %d, HW CPU cores: %d, npartitions: %d\n",
num_online_nodes(), num_online_cpus(),
@@ -1077,6 +993,7 @@ static int cfs_cpu_dead(unsigned int cpu)
return 0;

failed:
+ put_online_cpus();
cfs_cpu_fini();
return ret;
}
--
1.8.3.1


2018-05-29 14:31:06

by James Simmons

[permalink] [raw]
Subject: [PATCH v2 01/25] staging: lustre: libcfs: restore UMP handling

With the cleanup of the libcfs SMP handling, all UMP handling
was removed. As a result, various NULL pointers and empty
fields are now returned in the UMP case, which causes Lustre
to crash hard. Restore proper UMP handling so Lustre can
function correctly.

Signed-off-by: James Simmons <[email protected]>
Signed-off-by: Amir Shehata <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18916
Reviewed-by: Olaf Weber <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: James Simmons <[email protected]>
---
Changelog:

v1) New patch to handle the disappearance of UMP support

.../lustre/include/linux/libcfs/libcfs_cpu.h | 87 ++++++++++++++++------
drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 4 -
drivers/staging/lustre/lnet/libcfs/module.c | 4 +
3 files changed, 69 insertions(+), 26 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
index 61641c4..2ad12a6 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
@@ -74,6 +74,7 @@

#include <linux/cpu.h>
#include <linux/cpuset.h>
+#include <linux/slab.h>
#include <linux/topology.h>

/* any CPU partition */
@@ -89,10 +90,11 @@ struct cfs_cpu_partition {
/* spread rotor for NUMA allocator */
unsigned int cpt_spread_rotor;
};
-
+#endif /* CONFIG_SMP */

/** descriptor for CPU partitions */
struct cfs_cpt_table {
+#ifdef CONFIG_SMP
/* version, reserved for hotplug */
unsigned int ctb_version;
/* spread rotor for NUMA allocator */
@@ -103,14 +105,26 @@ struct cfs_cpt_table {
struct cfs_cpu_partition *ctb_parts;
/* shadow HW CPU to CPU partition ID */
int *ctb_cpu2cpt;
- /* all cpus in this partition table */
- cpumask_var_t ctb_cpumask;
/* all nodes in this partition table */
nodemask_t *ctb_nodemask;
+#else
+ nodemask_t ctb_nodemask;
+#endif /* CONFIG_SMP */
+ /* all cpus in this partition table */
+ cpumask_var_t ctb_cpumask;
};

extern struct cfs_cpt_table *cfs_cpt_tab;

+#ifdef CONFIG_SMP
+/**
+ * destroy a CPU partition table
+ */
+void cfs_cpt_table_free(struct cfs_cpt_table *cptab);
+/**
+ * create a cfs_cpt_table with \a ncpt number of partitions
+ */
+struct cfs_cpt_table *cfs_cpt_table_alloc(unsigned int ncpt);
/**
* return cpumask of CPU partition \a cpt
*/
@@ -208,20 +222,52 @@ void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
void cfs_cpu_fini(void);

#else /* !CONFIG_SMP */
-struct cfs_cpt_table;
-#define cfs_cpt_tab ((struct cfs_cpt_table *)NULL)

-static inline cpumask_var_t *
-cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
+static inline void cfs_cpt_table_free(struct cfs_cpt_table *cptab)
{
- return NULL;
+ kfree(cptab);
}

-static inline int
-cfs_cpt_table_print(struct cfs_cpt_table *cptab, char *buf, int len)
+static inline struct cfs_cpt_table *cfs_cpt_table_alloc(int ncpt)
{
- return 0;
+ struct cfs_cpt_table *cptab;
+
+ if (ncpt != 1)
+ return NULL;
+
+ cptab = kzalloc(sizeof(*cptab), GFP_NOFS);
+ if (!cptab)
+ return NULL;
+
+ if (!zalloc_cpumask_var(&cptab->ctb_cpumask, GFP_NOFS)) {
+ kfree(cptab);
+ return NULL;
+ }
+ cpumask_set_cpu(0, cptab->ctb_cpumask);
+ node_set(0, cptab->ctb_nodemask);
+
+ return cptab;
+}
+
+static inline int cfs_cpt_table_print(struct cfs_cpt_table *cptab,
+ char *buf, int len)
+{
+ int rc;
+
+ rc = snprintf(buf, len, "0\t: 0\n");
+ len -= rc;
+ if (len <= 0)
+ return -EFBIG;
+
+ return rc;
}
+
+static inline cpumask_var_t *
+cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
+{
+ return &cptab->ctb_cpumask;
+}
+
static inline int
cfs_cpt_number(struct cfs_cpt_table *cptab)
{
@@ -243,7 +289,7 @@ void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
static inline nodemask_t *
cfs_cpt_nodemask(struct cfs_cpt_table *cptab, int cpt)
{
- return NULL;
+ return &cptab->ctb_nodemask;
}

static inline int
@@ -328,24 +374,21 @@ void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
static inline int
cfs_cpu_init(void)
{
- return 0;
+ cfs_cpt_tab = cfs_cpt_table_alloc(1);
+
+ return cfs_cpt_tab ? 0 : -1;
}

static inline void cfs_cpu_fini(void)
{
+ if (cfs_cpt_tab) {
+ cfs_cpt_table_free(cfs_cpt_tab);
+ cfs_cpt_tab = NULL;
+ }
}

#endif /* CONFIG_SMP */

-/**
- * destroy a CPU partition table
- */
-void cfs_cpt_table_free(struct cfs_cpt_table *cptab);
-/**
- * create a cfs_cpt_table with \a ncpt number of partitions
- */
-struct cfs_cpt_table *cfs_cpt_table_alloc(unsigned int ncpt);
-
/*
* allocate per-cpu-partition data, returned value is an array of pointers,
* variable can be indexed by CPU ID.
diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
index 3d1cf45..803fc58 100644
--- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
@@ -41,10 +41,6 @@
#include <linux/libcfs/libcfs_string.h>
#include <linux/libcfs/libcfs.h>

-/** Global CPU partition table */
-struct cfs_cpt_table *cfs_cpt_tab __read_mostly;
-EXPORT_SYMBOL(cfs_cpt_tab);
-
/**
* modparam for setting number of partitions
*
diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
index 5dc7de9..b438d456 100644
--- a/drivers/staging/lustre/lnet/libcfs/module.c
+++ b/drivers/staging/lustre/lnet/libcfs/module.c
@@ -66,6 +66,10 @@ struct lnet_debugfs_symlink_def {

static struct dentry *lnet_debugfs_root;

+/** Global CPU partition table */
+struct cfs_cpt_table *cfs_cpt_tab __read_mostly;
+EXPORT_SYMBOL(cfs_cpt_tab);
+
BLOCKING_NOTIFIER_HEAD(libcfs_ioctl_list);
EXPORT_SYMBOL(libcfs_ioctl_list);

--
1.8.3.1


2018-05-29 23:46:19

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v2 01/25] staging: lustre: libcfs: restore UMP handling

On Tue, May 29 2018, James Simmons wrote:

> With the cleanup of the libcfs SMP handling all UMP handling
> was removed. In the process now various NULL pointers and
> empty fields are return in the UMP case which causes lustre
> to crash hard. Restore the proper UMP handling so Lustre can
> properly function.

Can't we just get lustre to handle the NULL pointer?
In most cases, the pointer is accessed through an accessor function, and
on !CONFIG_SMP, that can be a static inline that doesn't even look at
the pointer.
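To sketch what I mean (a hedged illustration with invented stand-in names; the real libcfs accessors differ): a static inline that never dereferences its argument makes a NULL table perfectly safe on UP builds.

```c
#include <assert.h>
#include <stddef.h>

struct cpt_table {
	int nparts;
};

#ifdef CONFIG_SMP
static inline int cpt_number(const struct cpt_table *t)
{
	return t->nparts;	/* SMP: real lookup in the table */
}
#else
static inline int cpt_number(const struct cpt_table *t)
{
	(void)t;		/* UP: argument is never dereferenced */
	return 1;		/* a UP system is always one partition */
}
#endif
```

With CONFIG_SMP unset, cpt_number(NULL) is well defined and returns 1, so the `#define cfs_cpt_tab ((struct cfs_cpt_table *)NULL)` scheme works without allocating anything at all.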

I really think this is a step backwards. If you can identify specific
problems caused by the current code, I'm sure we can fix them.

>
> Signed-off-by: James Simmons <[email protected]>
> Signed-off-by: Amir Shehata <[email protected]>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734

This bug doesn't seem to mention this patch at all

> Reviewed-on: http://review.whamcloud.com/18916

Nor does this review.

Thanks,
NeilBrown


> Reviewed-by: Olaf Weber <[email protected]>
> Reviewed-by: Doug Oucharek <[email protected]>
> Signed-off-by: James Simmons <[email protected]>
> ---
> Changelog:
>
> v1) New patch to handle the disappearence of UMP support
>
> .../lustre/include/linux/libcfs/libcfs_cpu.h | 87 ++++++++++++++++------
> drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 4 -
> drivers/staging/lustre/lnet/libcfs/module.c | 4 +
> 3 files changed, 69 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
> index 61641c4..2ad12a6 100644
> --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
> +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
> @@ -74,6 +74,7 @@
>
> #include <linux/cpu.h>
> #include <linux/cpuset.h>
> +#include <linux/slab.h>
> #include <linux/topology.h>
>
> /* any CPU partition */
> @@ -89,10 +90,11 @@ struct cfs_cpu_partition {
> /* spread rotor for NUMA allocator */
> unsigned int cpt_spread_rotor;
> };
> -
> +#endif /* CONFIG_SMP */
>
> /** descriptor for CPU partitions */
> struct cfs_cpt_table {
> +#ifdef CONFIG_SMP
> /* version, reserved for hotplug */
> unsigned int ctb_version;
> /* spread rotor for NUMA allocator */
> @@ -103,14 +105,26 @@ struct cfs_cpt_table {
> struct cfs_cpu_partition *ctb_parts;
> /* shadow HW CPU to CPU partition ID */
> int *ctb_cpu2cpt;
> - /* all cpus in this partition table */
> - cpumask_var_t ctb_cpumask;
> /* all nodes in this partition table */
> nodemask_t *ctb_nodemask;
> +#else
> + nodemask_t ctb_nodemask;
> +#endif /* CONFIG_SMP */
> + /* all cpus in this partition table */
> + cpumask_var_t ctb_cpumask;
> };
>
> extern struct cfs_cpt_table *cfs_cpt_tab;
>
> +#ifdef CONFIG_SMP
> +/**
> + * destroy a CPU partition table
> + */
> +void cfs_cpt_table_free(struct cfs_cpt_table *cptab);
> +/**
> + * create a cfs_cpt_table with \a ncpt number of partitions
> + */
> +struct cfs_cpt_table *cfs_cpt_table_alloc(unsigned int ncpt);
> /**
> * return cpumask of CPU partition \a cpt
> */
> @@ -208,20 +222,52 @@ void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
> void cfs_cpu_fini(void);
>
> #else /* !CONFIG_SMP */
> -struct cfs_cpt_table;
> -#define cfs_cpt_tab ((struct cfs_cpt_table *)NULL)
>
> -static inline cpumask_var_t *
> -cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
> +static inline void cfs_cpt_table_free(struct cfs_cpt_table *cptab)
> {
> - return NULL;
> + kfree(cptab);
> }
>
> -static inline int
> -cfs_cpt_table_print(struct cfs_cpt_table *cptab, char *buf, int len)
> +static inline struct cfs_cpt_table *cfs_cpt_table_alloc(int ncpt)
> {
> - return 0;
> + struct cfs_cpt_table *cptab;
> +
> + if (ncpt != 1)
> + return NULL;
> +
> + cptab = kzalloc(sizeof(*cptab), GFP_NOFS);
> + if (!cptab)
> + return NULL;
> +
> + if (!zalloc_cpumask_var(&cptab->ctb_cpumask, GFP_NOFS)) {
> + kfree(cptab);
> + return NULL;
> + }
> + cpumask_set_cpu(0, cptab->ctb_cpumask);
> + node_set(0, cptab->ctb_nodemask);
> +
> + return cptab;
> +}
> +
> +static inline int cfs_cpt_table_print(struct cfs_cpt_table *cptab,
> + char *buf, int len)
> +{
> + int rc;
> +
> + rc = snprintf(buf, len, "0\t: 0\n");
> + len -= rc;
> + if (len <= 0)
> + return -EFBIG;
> +
> + return rc;
> }
> +
> +static inline cpumask_var_t *
> +cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
> +{
> + return &cptab->ctb_cpumask;
> +}
> +
> static inline int
> cfs_cpt_number(struct cfs_cpt_table *cptab)
> {
> @@ -243,7 +289,7 @@ void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
> static inline nodemask_t *
> cfs_cpt_nodemask(struct cfs_cpt_table *cptab, int cpt)
> {
> - return NULL;
> + return &cptab->ctb_nodemask;
> }
>
> static inline int
> @@ -328,24 +374,21 @@ void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
> static inline int
> cfs_cpu_init(void)
> {
> - return 0;
> + cfs_cpt_tab = cfs_cpt_table_alloc(1);
> +
> + return cfs_cpt_tab ? 0 : -1;
> }
>
> static inline void cfs_cpu_fini(void)
> {
> + if (cfs_cpt_tab) {
> + cfs_cpt_table_free(cfs_cpt_tab);
> + cfs_cpt_tab = NULL;
> + }
> }
>
> #endif /* CONFIG_SMP */
>
> -/**
> - * destroy a CPU partition table
> - */
> -void cfs_cpt_table_free(struct cfs_cpt_table *cptab);
> -/**
> - * create a cfs_cpt_table with \a ncpt number of partitions
> - */
> -struct cfs_cpt_table *cfs_cpt_table_alloc(unsigned int ncpt);
> -
> /*
> * allocate per-cpu-partition data, returned value is an array of pointers,
> * variable can be indexed by CPU ID.
> diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
> index 3d1cf45..803fc58 100644
> --- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
> +++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
> @@ -41,10 +41,6 @@
> #include <linux/libcfs/libcfs_string.h>
> #include <linux/libcfs/libcfs.h>
>
> -/** Global CPU partition table */
> -struct cfs_cpt_table *cfs_cpt_tab __read_mostly;
> -EXPORT_SYMBOL(cfs_cpt_tab);
> -
> /**
> * modparam for setting number of partitions
> *
> diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
> index 5dc7de9..b438d456 100644
> --- a/drivers/staging/lustre/lnet/libcfs/module.c
> +++ b/drivers/staging/lustre/lnet/libcfs/module.c
> @@ -66,6 +66,10 @@ struct lnet_debugfs_symlink_def {
>
> static struct dentry *lnet_debugfs_root;
>
> +/** Global CPU partition table */
> +struct cfs_cpt_table *cfs_cpt_tab __read_mostly;
> +EXPORT_SYMBOL(cfs_cpt_tab);
> +
> BLOCKING_NOTIFIER_HEAD(libcfs_ioctl_list);
> EXPORT_SYMBOL(libcfs_ioctl_list);
>
> --
> 1.8.3.1




2018-05-30 10:06:48

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v2 01/25] staging: lustre: libcfs: restore UMP handling

On Tue, May 29, 2018 at 10:21:41AM -0400, James Simmons wrote:
> @@ -208,20 +222,52 @@ void cfs_cpt_unset_nodemask(struct cfs_cpt_table *cptab,
> void cfs_cpu_fini(void);
>
> #else /* !CONFIG_SMP */
> -struct cfs_cpt_table;
> -#define cfs_cpt_tab ((struct cfs_cpt_table *)NULL)
>
> -static inline cpumask_var_t *
> -cfs_cpt_cpumask(struct cfs_cpt_table *cptab, int cpt)
> +static inline void cfs_cpt_table_free(struct cfs_cpt_table *cptab)
> {
> - return NULL;
> + kfree(cptab);
> }

This should free cptab->ctb_cpumask?
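i.e. the separately allocated member has to be released before the containing struct, otherwise it leaks. A hedged userspace analogue of the ordering (invented names, not the libcfs code):

```c
#include <assert.h>
#include <stdlib.h>

struct table {
	unsigned long *mask;	/* separately allocated member */
};

static struct table *table_alloc(void)
{
	struct table *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->mask = calloc(1, sizeof(*t->mask));
	if (!t->mask) {
		free(t);
		return NULL;
	}
	return t;
}

static void table_free(struct table *t)
{
	free(t->mask);	/* member first, or its allocation leaks */
	free(t);
}
```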

regards,
dan carpenter


2018-06-01 08:29:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 00/25] staging: lustre: libcfs: SMP rework

On Tue, May 29, 2018 at 10:21:40AM -0400, James Simmons wrote:
> From: James Simmons <[email protected]>
>
> Recently lustre support has been expanded to extreme machines with as
> many as a 1000+ cores. On the other end lustre also has been ported
> to platforms like ARM and KNL which have uniquie NUMA and core setup.
> For example some devices exist that have NUMA nodes with no cores.
> With these new platforms the limitations of the Lustre's SMP code
> came to light so a lot of work was needed. This resulted in this
> patch set which has been tested on these platforms.

That's great work, but why is this happening instead of effort to get
this out of the staging tree? I see a mix of "clean this up" combined
with "add this new feature" happening here, and I am getting tired of
it.

I keep saying, "no new features until this gets out of staging", so why
isn't anyone working directly on getting this out of staging? I can
only assume the reason why is because I keep being "nice" and accepting
random new feature additions :(

So, no more, I'm really really really tired of dealing with this
filesystem for all of these years. There is still loads to do to get
this cleaned up and moved out (as my simple debugfs cleanup patch series
showed). So please do it.

Again, I am not going to take any more new feature additions or anything
that does not obviously look like an attempt to get this code cleaned up
into mergable shape. Adding things like "now works on systems with
thousands of CPUs as well as an RPi" is not such work, unless you can
directly show me an end result of cleaner, smaller, and more easily
reviewable code.

This patch series is now dropped, if you think it meets the above
criteria, please feel free to resend it to be judged in this manner.

thanks,

greg k-h

2018-06-01 08:32:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 v2 07/25] staging: lustre: libcfs: replace num_possible_cpus() with nr_cpu_ids

On Tue, May 29, 2018 at 10:21:47AM -0400, James Simmons wrote:
> From: Amir Shehata <[email protected]>
>
> Move from num_possible_cpus() to nr_cpu_ids.

Nit, when doing a "v2" of a patch, in a series that is not all "v2", put
the moniker after the patch number. Otherwise it does not sort properly
in anyone's email client and I have to manually edit it in order to get
it to apply in the correct order.

Remember, make things impossible for me to reject. This is trivial for
me to reject on that issue alone, given that I deal with hundreds of
staging patches a week...

thanks,

greg k-h

2018-06-01 08:34:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 v2 07/25] staging: lustre: libcfs: replace num_possible_cpus() with nr_cpu_ids

On Tue, May 29, 2018 at 10:21:47AM -0400, James Simmons wrote:
> From: Amir Shehata <[email protected]>
>
> Move from num_possible_cpus() to nr_cpu_ids.

This says what you did, not _why_ you did it. If these functions are
really identical, why do we have both of them? :)
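For what it's worth, they are not identical in general: nr_cpu_ids is the highest possible CPU id plus one, while num_possible_cpus() is the population count of cpu_possible_mask, and the two diverge when the possible mask is sparse. A hedged sketch of the distinction on a plain bitmask (invented names):

```c
#include <assert.h>

/* possible CPUs 0, 2 and 4: a sparse mask */
static const unsigned long possible_mask =
	(1UL << 0) | (1UL << 2) | (1UL << 4);

/* analogue of num_possible_cpus(): how many CPUs exist */
static int num_possible(unsigned long mask)
{
	return __builtin_popcountl(mask);
}

/* analogue of nr_cpu_ids: smallest size indexable by any CPU id */
static int nr_ids(unsigned long mask)
{
	if (!mask)
		return 0;
	return (int)(sizeof(mask) * 8) - __builtin_clzl(mask);
}
```

Arrays indexed by raw CPU id, like ctb_cpu2cpt in this series, therefore need to be sized by nr_cpu_ids, not num_possible_cpus().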

greg k-h

2018-06-01 08:35:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 18/25] staging: lustre: libcfs: rename goto label in cfs_cpt_table_print

On Tue, May 29, 2018 at 10:21:58AM -0400, James Simmons wrote:
> From: Dmitry Eremin <[email protected]>
>
> Change goto label out to err.

Again, this says what it does, not why.


2018-06-01 08:40:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 08/25] staging: lustre: libcfs: NUMA support

On Tue, May 29, 2018 at 10:21:48AM -0400, James Simmons wrote:
> From: Amir Shehata <[email protected]>
>
> This patch adds NUMA node support.

Really? It looks like you just added an empty data structure pointer
that doesn't really do anything at all.

Where are you reading the host memory NUMA information from?

And why would a filesystem care about this type of thing? Are you
going to now mirror what the scheduler does with regards to NUMA
topology issues? How are you going to handle things when the topology
changes? What systems did you test this on? What performance
improvements were seen? What downsides are there with all of this?

I need a whole lot more information here...

> NUMA node information is stored
> in the CPT table. A NUMA node mask is maintained for the entire
> table as well as for each CPT to track the NUMA nodes related to
> each of the CPTs. Add new function cfs_cpt_of_node() which returns
> the CPT of a particular NUMA node.

This doesn't really seem to match up with the code changes from what I
can tell...

thanks,

greg k-h

2018-06-01 08:40:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 13/25] staging: lustre: libcfs: fix libcfs_cpu coding style

On Tue, May 29, 2018 at 10:21:53AM -0400, James Simmons wrote:
> int cfs_cpt_set_nodemask(struct cfs_cpt_table *cptab,
> - int cpt, nodemask_t *mask);
> + int cpt, const nodemask_t *mask);

This is not a coding style change. It is a semi-functional change.
Don't bury code changes in the middle of "cleanup code formatting"
patches, it makes maintainers very grumpy when we notice this as we then
feel like we might have missed other things being snuck by us in the
guise of these types of patches...

thanks,

greg k-h

2018-06-01 08:42:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 11/25] staging: lustre: libcfs: provide debugfs files for distance handling

On Tue, May 29, 2018 at 10:21:51AM -0400, James Simmons wrote:
> From: Amir Shehata <[email protected]>
>
> On systems with large number of NUMA nodes and cores it is easy
> to incorrectly configure their use with Lustre. Provide debugfs
> files which can help track down any issues.
>
> Signed-off-by: Amir Shehata <[email protected]>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
> Reviewed-on: http://review.whamcloud.com/18916
> Reviewed-by: Olaf Weber <[email protected]>
> Reviewed-by: Doug Oucharek <[email protected]>
> Signed-off-by: James Simmons <[email protected]>
> ---
> Changelog:
>
> v1) Initial patch
> v2) Rebased patch. No code changes from original patch
>
> drivers/staging/lustre/lnet/libcfs/module.c | 53 +++++++++++++++++++++++++++++
> 1 file changed, 53 insertions(+)
>
> diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
> index b438d456..d2dfc29 100644
> --- a/drivers/staging/lustre/lnet/libcfs/module.c
> +++ b/drivers/staging/lustre/lnet/libcfs/module.c
> @@ -468,6 +468,53 @@ static int proc_cpt_table(struct ctl_table *table, int write,
> __proc_cpt_table);
> }
>
> +static int __proc_cpt_distance(void *data, int write,
> + loff_t pos, void __user *buffer, int nob)
> +{
> + char *buf = NULL;
> + int len = 4096;
> + int rc = 0;
> +
> + if (write)
> + return -EPERM;
> +
> + LASSERT(cfs_cpt_tab);

What is this assert really checking? Why is it needed?

thanks,

greg k-h

2018-06-01 08:45:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 23/25] staging: lustre: libcfs: rework CPU pattern parsing code

On Tue, May 29, 2018 at 10:22:03AM -0400, James Simmons wrote:
> From: Dmitry Eremin <[email protected]>
>
> Currently the module param string for CPU pattern can be
> modified which is wrong. Rewrite CPU pattern parsing code
> to avoid the passed buffer from being changed. This change
> also enables us to add real errors propogation to the caller
> functions.
>
> Signed-off-by: Dmitry Eremin <[email protected]>
> Signed-off-by: Amir Shehata <[email protected]>
> Signed-off-by: Andreas Dilger <[email protected]>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
> Reviewed-on: https://review.whamcloud.com/23306
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-9715
> Reviewed-on: https://review.whamcloud.com/27872
> Reviewed-by: James Simmons <[email protected]>
> Reviewed-by: Andreas Dilger <[email protected]>
> Reviewed-by: Patrick Farrell <[email protected]>
> Reviewed-by: Olaf Weber <[email protected]>
> Reviewed-by: Oleg Drokin <[email protected]>
> Signed-off-by: James Simmons <[email protected]>
> ---
> Changelog:
>
> v1) Initial patch
> v2) Rebased patch. No changes in code from earlier patch
>
> .../lustre/include/linux/libcfs/libcfs_cpu.h | 2 +-
> drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c | 146 ++++++++++++---------
> 2 files changed, 87 insertions(+), 61 deletions(-)
>
> diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
> index c0aa0b3..12ed0a9 100644
> --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
> +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_cpu.h
> @@ -393,7 +393,7 @@ static inline int cfs_cpu_init(void)
>
> static inline void cfs_cpu_fini(void)
> {
> - if (cfs_cpt_tab) {
> + if (!IS_ERR_OR_NULL(cfs_cpt_tab)) {
> cfs_cpt_table_free(cfs_cpt_tab);
> cfs_cpt_tab = NULL;
> }
> diff --git a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
> index 649f7f9..aed48de 100644
> --- a/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
> +++ b/drivers/staging/lustre/lnet/libcfs/libcfs_cpu.c
> @@ -692,11 +692,11 @@ int cfs_cpt_bind(struct cfs_cpt_table *cptab, int cpt)
> nodemask = cptab->ctb_parts[cpt].cpt_nodemask;
> }
>
> - if (cpumask_any_and(*cpumask, cpu_online_mask) >= nr_cpu_ids) {
> + if (!cpumask_intersects(*cpumask, cpu_online_mask)) {
> CDEBUG(D_INFO,
> "No online CPU found in CPU partition %d, did someone do CPU hotplug on system? You might need to reload Lustre modules to keep system working well.\n",
> cpt);

This is the funniest error message I have seen in a while.

No one should have to reload all kernel modules just because the CPU
topology changed, that's crazy. You have the ability to read all of
this at runtime, and react to changes that happen while the system is
running. You should never need/rely on userspace passing in random
strings to pretend to match up with what the system really has at the
moment, that way lies madness.

All of this should be ripped out and use the proper apis instead. No
special userspace api should be needed at all.

greg k-h

2018-06-13 22:03:10

by James Simmons

[permalink] [raw]
Subject: Re: [PATCH v2 01/25] staging: lustre: libcfs: restore UMP handling


> > With the cleanup of the libcfs SMP handling all UMP handling
> > was removed. In the process various NULL pointers and
> > empty fields are now returned in the UMP case, which causes lustre
> > to crash hard. Restore the proper UMP handling so Lustre can
> > properly function.
>
> Can't we just get lustre to handle the NULL pointer?
> In most cases, the pointer is accessed through an accessor function, and
> on !CONFIG_SMP, that can be a static inline that doesn't even look at
> the pointer.

Lots of NULL pointer checks for a structure allocated at libcfs module
start and only cleaned up at libcfs removal is not a clean approach.
So I have thought about it, and I have to ask why we allocate a global
struct cfs_cpu_table at all. It could be made static and filled in, which
would avoid the whole NULL pointer issue. Plus, for the UMP case, why
allocate a new cfs_cpu_table with cfs_cpt_table_alloc() when it is exactly
like the default UMP cfs_cpu_table? Instead we could just return a pointer
to the static default cfs_cpt_tab every time. We still have the NULL
ctb_cpumask field to deal with. Does that sound like a better solution
to you? Doug, what do you think?

> I really think this is a step backwards. If you can identify specific
> problems caused by the current code, I'm sure we can fix them.
>
> >
> > Signed-off-by: James Simmons <[email protected]>
> > Signed-off-by: Amir Shehata <[email protected]>
> > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
>
> This bug doesn't seem to mention this patch at all
>
> > Reviewed-on: http://review.whamcloud.com/18916
>
> Nor does this review.

Yeah, it's mutated so much from what is in the Intel tree.
I do believe it was the last patch to touch this.

2018-06-13 22:28:33

by Doug Oucharek

[permalink] [raw]
Subject: Re: [lustre-devel] [PATCH v2 01/25] staging: lustre: libcfs: restore UMP handling


> On Jun 13, 2018, at 3:02 PM, James Simmons <[email protected]> wrote:
>
>
>>> With the cleanup of the libcfs SMP handling all UMP handling
>>> was removed. In the process various NULL pointers and
>>> empty fields are now returned in the UMP case, which causes lustre
>>> to crash hard. Restore the proper UMP handling so Lustre can
>>> properly function.
>>
>> Can't we just get lustre to handle the NULL pointer?
>> In most cases, the pointer is accessed through an accessor function, and
>> on !CONFIG_SMP, that can be a static inline that doesn't even look at
>> the pointer.
>
> Lots of NULL pointer checks for a structure allocated at libcfs module
> start and only cleaned up at libcfs removal is not a clean approach.
> So I have thought about it, and I have to ask why we allocate a global
> struct cfs_cpu_table at all. It could be made static and filled in, which
> would avoid the whole NULL pointer issue. Plus, for the UMP case, why
> allocate a new cfs_cpu_table with cfs_cpt_table_alloc() when it is exactly
> like the default UMP cfs_cpu_table? Instead we could just return a pointer
> to the static default cfs_cpt_tab every time. We still have the NULL
> ctb_cpumask field to deal with. Does that sound like a better solution
> to you? Doug, what do you think?

I like your suggestion, James. Always having a cfs_cpu_table so we don’t have to worry about a NULL pointer would be good. I think we can track down the code which relies on the cpumask field and make it safe for the NULL case. Is there a possibility of having an empty cpumask rather than it being NULL?

>
>> I really think this is a step backwards. If you can identify specific
>> problems caused by the current code, I'm sure we can fix them.
>>
>>>
>>> Signed-off-by: James Simmons <[email protected]>
>>> Signed-off-by: Amir Shehata <[email protected]>
>>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
>>
>> This bug doesn't seem to mention this patch at all
>>
>>> Reviewed-on: http://review.whamcloud.com/18916
>>
>> Nor does this review.
>
> Yeah, it's mutated so much from what is in the Intel tree.
> I do believe it was the last patch to touch this.
> _______________________________________________
> lustre-devel mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

2018-06-13 22:31:58

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v2 01/25] staging: lustre: libcfs: restore UMP handling

On Wed, Jun 13 2018, James Simmons wrote:

>> > With the cleanup of the libcfs SMP handling all UMP handling
>> > was removed. In the process various NULL pointers and
>> > empty fields are now returned in the UMP case, which causes lustre
>> > to crash hard. Restore the proper UMP handling so Lustre can
>> > properly function.
>>
>> Can't we just get lustre to handle the NULL pointer?
>> In most cases, the pointer is accessed through an accessor function, and
>> on !CONFIG_SMP, that can be a static inline that doesn't even look at
>> the pointer.
>
> Lots of NULL pointer checks for a structure allocated at libcfs module
> start and only cleaned up at libcfs removal is not a clean approach.
> So I have thought about it, and I have to ask why we allocate a global
> struct cfs_cpu_table at all. It could be made static and filled in, which
> would avoid the whole NULL pointer issue. Plus, for the UMP case, why
> allocate a new cfs_cpu_table with cfs_cpt_table_alloc() when it is exactly
> like the default UMP cfs_cpu_table? Instead we could just return a pointer
> to the static default cfs_cpt_tab every time. We still have the NULL
> ctb_cpumask field to deal with. Does that sound like a better solution
> to you? Doug, what do you think?

I'm not convinced there will be lots of NULL pointer checks - maybe one
or two. Most of the accesses to the structure are inside helper
functions which are empty inlines in the UP build.

However I don't object to a static cfs_cpt_tab if that turns out to make
some code simpler. I would want it to be clear up-front which code was
simplified so that an informed decision could be made.

Thanks,
NeilBrown


>
>> I really think this is a step backwards. If you can identify specific
>> problems caused by the current code, I'm sure we can fix them.
>>
>> >
>> > Signed-off-by: James Simmons <[email protected]>
>> > Signed-off-by: Amir Shehata <[email protected]>
>> > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7734
>>
>> This bug doesn't seem to mention this patch at all
>>
>> > Reviewed-on: http://review.whamcloud.com/18916
>>
>> Nor does this review.
>
> Yeah, it's mutated so much from what is in the Intel tree.
> I do believe it was the last patch to touch this.

