2023-09-22 14:02:33

by Halil Pasic

Subject: Re: [PATCH] s390/cio: Fix a memleak in css_alloc_subchannel

On Fri, 22 Sep 2023 14:25:58 +0200
Cornelia Huck <[email protected]> wrote:

> > -	spin_lock_init(&sch->lock);
> > +	sch->schid = schid;
> > +	if (cio_is_console(schid)) {
> > +		sch->lock = cio_get_console_lock();
> > +	} else {
> > +		err = cio_create_sch_lock(sch);
> > +		if (err)
> > +			goto out;
> > +	}
> >
> > I did not spend a huge amount of time looking at this but this
> > is the only reason I found for sch->lock being made a pointer. There may
> > be others, I'm just saying that is all I've found.
>
> Author of 2ec2298412e1 here. If I don't completely misremember things,
> this was for the orphanage stuff (i.e. ccw devices that were still kept
> as disconnected, like dasd still in use, that had to be moved from their
> old subchannel object because a different device appeared on that
> subchannel.) That orphanage used a single dummy subchannel for all ccw
> devices moved there.
>
> I have no idea how the current common I/O layer works, but that might
> give you a hint about what to look for :)

Yes, that is what the commit states and what the series is about. I hope
Vineeth can give us some answers :) maybe even off the top of his
head... If not, I would trust his judgment on whether figuring things
out is worthwhile or not.


Regards,
Halil


2023-09-23 01:02:07

by Vineeth Vijayan

Subject: Re: [PATCH] s390/cio: Fix a memleak in css_alloc_subchannel



On 9/22/23 15:20, Halil Pasic wrote:
>> Author of 2ec2298412e1 here. If I don't completely misremember things,
>> this was for the orphanage stuff (i.e. ccw devices that were still kept
>> as disconnected, like dasd still in use, that had to be moved from their
>> old subchannel object because a different device appeared on that
>> subchannel.) That orphanage used a single dummy subchannel for all ccw
>> devices moved there.
>>
>> I have no idea how the current common I/O layer works, but that might
>> give you a hint about what to look for :)
> Yes, that is what the commit states and what the series is about. I hope
> Vineeth can give us some answers :) maybe even off the top of his
> head... If not, I would trust his judgment on whether figuring things
> out is worthwhile or not.
>
As Corny mentioned, the orphanage is the only case I remember where
a dynamically allocated sch->lock is used. I hope you remember
cdev->ccwlock, which is nothing but a copy of the sch->lock pointer.
This is a rather tricky design: sch->lock and cdev->ccwlock are the
same pointer, because this sch is used exclusively for the cdev ops.
But at the same time, a CC3 from stsch can turn the attached device
into an orphan and remove the sch.
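
The aliasing described above can be sketched in user space. This is
only a toy model, not kernel code: struct toy_lock stands in for
spinlock_t, and every toy_* name is hypothetical. It shows nothing more
than the fact that the device's lock pointer is a copy of the
subchannel's lock pointer, so both refer to one lock object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for spinlock_t; NOT the kernel type. */
struct toy_lock {
	bool held;
};

/* Simplified model of struct subchannel: the lock is a pointer. */
struct toy_subchannel {
	struct toy_lock *lock;		/* models sch->lock */
};

/* Simplified model of struct ccw_device. */
struct toy_ccw_device {
	struct toy_lock *ccwlock;	/* models cdev->ccwlock */
};

/* Allocate a subchannel together with its dynamically allocated lock. */
struct toy_subchannel *toy_alloc_sch(void)
{
	struct toy_subchannel *sch = calloc(1, sizeof(*sch));

	if (!sch)
		return NULL;
	sch->lock = calloc(1, sizeof(*sch->lock));
	if (!sch->lock) {
		free(sch);	/* do not leak sch on the error path */
		return NULL;
	}
	return sch;
}

/* Attach a device: its lock pointer becomes an alias of sch->lock. */
void toy_attach(struct toy_ccw_device *cdev, struct toy_subchannel *sch)
{
	assert(cdev && sch && sch->lock);
	cdev->ccwlock = sch->lock;
}

/* Release the subchannel and the lock it owns. */
void toy_free_sch(struct toy_subchannel *sch)
{
	if (!sch)
		return;
	free(sch->lock);
	free(sch);
}
```

After toy_attach(), taking the lock through either pointer affects the
same object, which is why freeing the subchannel's lock while a device
still holds the alias would be a problem in this model.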

We have already seen an issue with this approach and have had a couple
of discussions about avoiding this pointer usage without introducing an
extra lock, but we do not have the right solution for this yet.

2023-09-24 21:01:13

by Halil Pasic

Subject: Re: [PATCH] s390/cio: Fix a memleak in css_alloc_subchannel

On Fri, 22 Sep 2023 21:15:48 +0200
Vineeth Vijayan <[email protected]> wrote:

> On 9/22/23 15:20, Halil Pasic wrote:
> >> Author of 2ec2298412e1 here. If I don't completely misremember things,
> >> this was for the orphanage stuff (i.e. ccw devices that were still kept
> >> as disconnected, like dasd still in use, that had to be moved from their
> >> old subchannel object because a different device appeared on that
> >> subchannel.) That orphanage used a single dummy subchannel for all ccw
> >> devices moved there.
> >>
> >> I have no idea how the current common I/O layer works, but that might
> >> give you a hint about what to look for :)
> > Yes, that is what the commit states and what the series is about. I hope
> > Vineeth can give us some answers :) maybe even off the top of his
> > head... If not, I would trust his judgment on whether figuring things
> > out is worthwhile or not.
> >
> As Corny mentioned, the orphanage is the only case I remember where
> a dynamically allocated sch->lock is used. I hope you remember
> cdev->ccwlock, which is nothing but a copy of the sch->lock pointer.
> This is a rather tricky design: sch->lock and cdev->ccwlock are the
> same pointer, because this sch is used exclusively for the cdev ops.
> But at the same time, a CC3 from stsch can turn the attached device
> into an orphan and remove the sch.
>
> We have already seen an issue with this approach and have had a couple
> of discussions about avoiding this pointer usage without introducing an
> extra lock, but we do not have the right solution for this yet.

Based on your response it seems you do understand the problem but are
struggling to find a solution. You are ahead of me. I'm still at the
stage where I don't understand the problem. I had another look at
that orphanage code, especially at ccw_device_move_to_sch(). It looks
to me like *(sch->lock) is not required to outlive *sch, and
also that there are no move semantics in place.
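
To make that reading concrete, here is a toy user-space sketch. It is
not the kernel implementation: the toy_* names are hypothetical, and it
assumes that moving a device simply re-points the device's lock pointer
at the new subchannel's lock. Under that assumption the lock object
itself is never moved, and the old lock can be freed with its
subchannel.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_lock { int unused; };		/* stand-in for spinlock_t */
struct toy_sch { struct toy_lock *lock; };	/* models struct subchannel */
struct toy_cdev { struct toy_lock *ccwlock; };	/* models struct ccw_device */

/* Allocate a subchannel and the lock it owns. */
struct toy_sch *toy_sch_new(void)
{
	struct toy_sch *sch = calloc(1, sizeof(*sch));

	if (!sch)
		return NULL;
	sch->lock = calloc(1, sizeof(*sch->lock));
	if (!sch->lock) {
		free(sch);
		return NULL;
	}
	return sch;
}

/* Free a subchannel together with its lock: same lifetime for both. */
void toy_sch_free(struct toy_sch *sch)
{
	if (!sch)
		return;
	free(sch->lock);
	free(sch);
}

/*
 * Re-home a device (assumed behaviour): only the alias is re-pointed.
 * The lock object stays with the subchannel that allocated it.
 */
void toy_move_to_sch(struct toy_cdev *cdev, struct toy_sch *new_sch)
{
	assert(cdev && new_sch && new_sch->lock);
	cdev->ccwlock = new_sch->lock;
}
```

In this model, once toy_move_to_sch() has run, the old subchannel and
its lock can be released together without leaving the device with a
dangling lock pointer. Whether the real code keeps that property on
every path is exactly the open question in this thread.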

Based on that let's take this offline, find a quiet hour and have a look
at the code and the problem. Maybe I can help with the solution once I
understand the problem -- but maybe not.

Regards,
Halil