Subject: Re: corruption causing crash in __queue_work
To: Tejun Heo
References: <566819D8.5090804@kyup.com> <20151209160803.GK30240@mtj.duckdns.org> <56685573.1020805@kyup.com> <20151209162744.GN30240@mtj.duckdns.org> <566945A2.1050208@kyup.com> <20151210152901.GR30240@mtj.duckdns.org>
Cc: linux-kernel@vger.kernel.org, SiteGround Operations
From: Nikolay Borisov
Message-ID: <566AF262.8050009@kyup.com>
Date: Fri, 11 Dec 2015 17:57:22 +0200
In-Reply-To: <20151210152901.GR30240@mtj.duckdns.org>

On 12/10/2015 05:29 PM, Tejun Heo wrote:
> On Thu, Dec 10, 2015 at 11:28:02AM +0200, Nikolay Borisov wrote:
>> On 12/09/2015 06:27 PM, Tejun Heo wrote:
>>> Hello,
>>>
>>> On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote:
>>>> I think we are seeing this at least daily on at least 1 server (we
>>>> have multiple servers like that), so adding printk's would likely be
>>>> the way to go. Anything in particular you might be interested in
>>>> knowing? I see RCU stuff around, so it might be a tricky race
>>>> condition.
>>>
>>> Printing out the workqueue's pointer, name, pwq's pointer, the node
>>> being installed for and the installed pointer should give us enough
>>> clues. There's RCU involved but the pointers shouldn't be becoming
>>> NULLs unless we're installing NULL ptrs.
>>
>> So the debug patch has been rolled out on 1 server and several more
>> are in the process; here is what it prints:
>>
>> WQ: ffff88046f00ba00 (events_unbound) old_pwq: (null) new_pwq: ffff88046f00d300 node: 0
> ...
>> Is this format ok? Also, I observed the exact same crash on a machine
>> running a 4.1.12 kernel as well.
>
> Yeah, I think it can be a good starting point.
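For anyone following along: the instrumentation boils down to a pr_info()
in numa_pwq_tbl_install() in kernel/workqueue.c, which is the one place
where unbound pwqs get published into wq->numa_pwq_tbl[]. A minimal
sketch of it (not the verbatim patch; the exact placement of the printk
is my choice) looks like this:

/*
 * Sketch of the debug instrumentation, not the verbatim patch:
 * numa_pwq_tbl_install() publishes an unbound pwq into the per-node
 * table, so logging here captures every old_pwq -> new_pwq transition
 * for every (workqueue, node) pair.
 */
static struct pool_workqueue *numa_pwq_tbl_install(struct workqueue_struct *wq,
						   int node,
						   struct pool_workqueue *pwq)
{
	struct pool_workqueue *old_pwq;

	lockdep_assert_held(&wq->mutex);

	/* link_pwq() can handle duplicate calls */
	link_pwq(pwq);

	old_pwq = rcu_access_pointer(wq->numa_pwq_tbl[node]);
	/* %p prints "(null)" for NULL pointers, hence the output above */
	pr_info("WQ: %p (%s) old_pwq: %p new_pwq: %p node: %d\n",
		wq, wq->name, old_pwq, pwq, node);
	rcu_assign_pointer(wq->numa_pwq_tbl[node], pwq);
	return old_pwq;
}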
So I had a server with the patch just crash on me. Here is what the
workqueue looks like:

crash> struct workqueue_struct 0xffff8802420a4a00
struct workqueue_struct {
  pwqs = {
    next = 0xffff8802420a4c00,
    prev = 0xffff8802420a4a00
  },
  list = {
    next = 0xffff880351f9b210,
    prev = 0xdead000000200200
  },
  mutex = {
    count = {
      counter = 1
    },
    wait_lock = {
      {
        rlock = {
          raw_lock = {
            val = {
              counter = 0
            }
          }
        }
      }
    },
    wait_list = {
      next = 0xffff8802420a4a28,
      prev = 0xffff8802420a4a28
    },
    owner = 0x0,
    osq = {
      tail = {
        counter = 0
      }
    }
  },
  work_color = 3,
  flush_color = 3,
  nr_pwqs_to_flush = {
    counter = 0
  },
  first_flusher = 0x0,
  flusher_queue = {
    next = 0xffff8802420a4a60,
    prev = 0xffff8802420a4a60
  },
  flusher_overflow = {
    next = 0xffff8802420a4a70,
    prev = 0xffff8802420a4a70
  },
  maydays = {
    next = 0xffff8802420a4a80,
    prev = 0xffff8802420a4a80
  },
  rescuer = 0xffff88046932ce40,
  nr_drainers = 0,
  saved_max_active = 1,
  unbound_attrs = 0xffff8801a76c1f00,
  dfl_pwq = 0x0,
  wq_dev = 0x0,
  name = "dm-thin\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
  rcu = {
    next = 0xffff8802531c4c20,
    func = 0xffffffff810692e0
  },
  flags = 131082,
  cpu_pwqs = 0x0,
  numa_pwq_tbl = 0xffff8802420a4b10
}

crash> rd 0xffff8802420a4b10 2
ffff8802420a4b10:  0000000000000000 0000000000000000   ................

(The machine has 2 NUMA nodes, hence the '2' argument to rd.) So both
per-node slots of numa_pwq_tbl are NULL, and note that dfl_pwq is NULL
as well.

At the same time, searching for 0xffff8802420a4a00 in the debug output
turns up nothing, IOW it seems a pwq was never installed into
numa_pwq_tbl for this workqueue at all:

[root@smallvault8 ~]# grep 0xffff8802420a4a00 /var/log/messages

Dumping all the logs from the dmesg buffer contained in the vmcore image
also finds nothing. And doing the following correlation:

[root@smallvault8 ~]# grep \(null\) wq.log | wc -l
1940
[root@smallvault8 ~]# wc -l wq.log
1940 wq.log

shows that every install the patch logged had old_pwq == (null), i.e.
numa_pwq_tbl is only ever populated at workqueue creation and never
re-assigned afterwards. So at this point it looks like there is a
situation where the workqueue attributes are not being applied at all.
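For context on why such a hole is fatal: when work is queued on an
unbound workqueue, __queue_work() looks the pwq up straight from the
per-node table and dereferences it without a NULL check, so a
never-populated numa_pwq_tbl[] slot would crash exactly there, which
matches the crashes we're seeing. From 4.1-era kernel/workqueue.c
(abridged excerpt, elisions marked):

static struct pool_workqueue *unbound_pwq_by_node(struct workqueue_struct *wq,
						  int node)
{
	return rcu_dereference_raw(wq->numa_pwq_tbl[node]);
}

static void __queue_work(int cpu, struct workqueue_struct *wq,
			 struct work_struct *work)
{
	struct pool_workqueue *pwq;
	/* ... */

	/* pwq which will be used unless @work is executing elsewhere */
	if (!(wq->flags & WQ_UNBOUND))
		pwq = per_cpu_ptr(wq->cpu_pwqs, cpu);
	else
		pwq = unbound_pwq_by_node(wq, cpu_to_node(cpu));

	/*
	 * If the table slot was never installed, pwq is NULL here and
	 * the pwq->pool dereference below is a NULL pointer dereference.
	 */
	spin_lock(&pwq->pool->lock);
	/* ... */
}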