2014-01-14 04:51:15

by Stephen Rothwell

Subject: linux-next: manual merge of the akpm-current tree with the char-misc tree

Hi Andrew,

Today's linux-next merge of the akpm-current tree got a conflict in
drivers/misc/mei/init.c between commit 33ec08263147 ("mei: revamp mei
reset state machine") from the char-misc tree and commit dd045dab2999
("drivers/misc/mei: ratelimit several error messages") from the
akpm-current tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

--
Cheers,
Stephen Rothwell [email protected]

diff --cc drivers/misc/mei/init.c
index cdd31c2a2a2b,edd3bb6a5df9..000000000000
--- a/drivers/misc/mei/init.c
+++ b/drivers/misc/mei/init.c
@@@ -43,119 -43,41 +43,119 @@@ const char *mei_dev_state_str(int state
#undef MEI_DEV_STATE
}

-void mei_device_init(struct mei_device *dev)
-{
- /* setup our list array */
- INIT_LIST_HEAD(&dev->file_list);
- INIT_LIST_HEAD(&dev->device_list);
- mutex_init(&dev->device_lock);
- init_waitqueue_head(&dev->wait_hw_ready);
- init_waitqueue_head(&dev->wait_recvd_msg);
- init_waitqueue_head(&dev->wait_stop_wd);
- dev->dev_state = MEI_DEV_INITIALIZING;

- mei_io_list_init(&dev->read_list);
- mei_io_list_init(&dev->write_list);
- mei_io_list_init(&dev->write_waiting_list);
- mei_io_list_init(&dev->ctrl_wr_list);
- mei_io_list_init(&dev->ctrl_rd_list);
+/**
+ * mei_cancel_work. Cancel mei background jobs
+ *
+ * @dev: the device structure
+ *
+ * returns 0 on success or < 0 if the reset hasn't succeeded
+ */
+void mei_cancel_work(struct mei_device *dev)
+{
+ cancel_work_sync(&dev->init_work);
+ cancel_work_sync(&dev->reset_work);

- INIT_DELAYED_WORK(&dev->timer_work, mei_timer);
- INIT_WORK(&dev->init_work, mei_host_client_init);
+ cancel_delayed_work(&dev->timer_work);
+}
+EXPORT_SYMBOL_GPL(mei_cancel_work);

- INIT_LIST_HEAD(&dev->wd_cl.link);
- INIT_LIST_HEAD(&dev->iamthif_cl.link);
- mei_io_list_init(&dev->amthif_cmd_list);
- mei_io_list_init(&dev->amthif_rd_complete_list);
+/**
+ * mei_reset - resets host and fw.
+ *
+ * @dev: the device structure
+ */
+int mei_reset(struct mei_device *dev)
+{
+ enum mei_dev_state state = dev->dev_state;
+ bool interrupts_enabled;
+ int ret;

- bitmap_zero(dev->host_clients_map, MEI_CLIENTS_MAX);
- dev->open_handle_count = 0;
+ if (state != MEI_DEV_INITIALIZING &&
+ state != MEI_DEV_DISABLED &&
+ state != MEI_DEV_POWER_DOWN &&
+ state != MEI_DEV_POWER_UP)
- dev_warn(&dev->pdev->dev, "unexpected reset: dev_state = %s\n",
++ dev_warn_ratelimited(&dev->pdev->dev, "unexpected reset: dev_state = %s\n",
+ mei_dev_state_str(state));

- /*
- * Reserving the first client ID
- * 0: Reserved for MEI Bus Message communications
+ /* we're already in reset, cancel the init timer
+ * if the reset was called due the hbm protocol error
+ * we need to call it before hw start
+ * so the hbm watchdog won't kick in
*/
- bitmap_set(dev->host_clients_map, 0, 1);
+ mei_hbm_idle(dev);
+
+ /* enter reset flow */
+ interrupts_enabled = state != MEI_DEV_POWER_DOWN;
+ dev->dev_state = MEI_DEV_RESETTING;
+
+ dev->reset_count++;
+ if (dev->reset_count > MEI_MAX_CONSEC_RESET) {
+ dev_err(&dev->pdev->dev, "reset: reached maximal consecutive resets: disabling the device\n");
+ dev->dev_state = MEI_DEV_DISABLED;
+ return -ENODEV;
+ }
+
+ ret = mei_hw_reset(dev, interrupts_enabled);
+ /* fall through and remove the sw state even if hw reset has failed */
+
+ /* no need to clean up software state in case of power up */
+ if (state != MEI_DEV_INITIALIZING &&
+ state != MEI_DEV_POWER_UP) {
+
+ /* remove all waiting requests */
+ mei_cl_all_write_clear(dev);
+
+ mei_cl_all_disconnect(dev);
+
+ /* wake up all readers and writers so they can be interrupted */
+ mei_cl_all_wakeup(dev);
+
+ /* remove entry if already in list */
+ dev_dbg(&dev->pdev->dev, "remove iamthif and wd from the file list.\n");
+ mei_cl_unlink(&dev->wd_cl);
+ mei_cl_unlink(&dev->iamthif_cl);
+ mei_amthif_reset_params(dev);
+ memset(&dev->wr_ext_msg, 0, sizeof(dev->wr_ext_msg));
+ }
+
+
+ dev->me_clients_num = 0;
+ dev->rd_msg_hdr = 0;
+ dev->wd_pending = false;
+
+ if (ret) {
+ dev_err(&dev->pdev->dev, "hw_reset failed ret = %d\n", ret);
+ dev->dev_state = MEI_DEV_DISABLED;
+ return ret;
+ }
+
+ if (state == MEI_DEV_POWER_DOWN) {
+ dev_dbg(&dev->pdev->dev, "powering down: end of reset\n");
+ dev->dev_state = MEI_DEV_DISABLED;
+ return 0;
+ }
+
+ ret = mei_hw_start(dev);
+ if (ret) {
+ dev_err(&dev->pdev->dev, "hw_start failed ret = %d\n", ret);
+ dev->dev_state = MEI_DEV_DISABLED;
+ return ret;
+ }
+
+ dev_dbg(&dev->pdev->dev, "link is established start sending messages.\n");
+
+ dev->dev_state = MEI_DEV_INIT_CLIENTS;
+ ret = mei_hbm_start_req(dev);
+ if (ret) {
+ dev_err(&dev->pdev->dev, "hbm_start failed ret = %d\n", ret);
+ dev->dev_state = MEI_DEV_DISABLED;
+ return ret;
+ }
+
+ return 0;
}
-EXPORT_SYMBOL_GPL(mei_device_init);
+EXPORT_SYMBOL_GPL(mei_reset);

/**
* mei_start - initializes host and fw to start work.



2014-01-14 08:31:49

by Tomas Winkler

Subject: RE: linux-next: manual merge of the akpm-current tree with the char-misc tree



> -----Original Message-----
> From: Stephen Rothwell [mailto:[email protected]]
> Sent: Tuesday, January 14, 2014 06:51
> To: Andrew Morton; Greg KH; Arnd Bergmann
> Cc: [email protected]; [email protected]; Ian Munsie;
> Winkler, Tomas
> Subject: linux-next: manual merge of the akpm-current tree with the char-misc
> tree
>
> Hi Andrew,
>
> Today's linux-next merge of the akpm-current tree got a conflict in
> drivers/misc/mei/init.c between commit 33ec08263147 ("mei: revamp mei
> reset state machine") from the char-misc tree and commit dd045dab2999
> ("drivers/misc/mei: ratelimit several error messages") from the
> akpm-current tree.
>
> I fixed it up (see below) and can carry the fix as necessary (no action
> is required).

Can we just drop this rate limit stuff? I never asked for it.
Tomas

2014-01-15 01:57:50

by Ian Munsie

Subject: RE: linux-next: manual merge of the akpm-current tree with the char-misc tree

Excerpts from Winkler, Tomas's message of 2014-01-14 19:31:26 +1100:
> > Today's linux-next merge of the akpm-current tree got a conflict in
> > drivers/misc/mei/init.c between commit 33ec08263147 ("mei: revamp mei
> > reset state machine") from the char-misc tree and commit dd045dab2999
> > ("drivers/misc/mei: ratelimit several error messages") from the
> > akpm-current tree.
> >
> > I fixed it up (see below) and can carry the fix as necessary (no action
> > is required).
>
> Can we just drop this rate limit stuff? I never asked for it.
> Tomas

Hi Tomas,

So far the problem has only been a one-off thing for me, so it would
seem that whatever circumstances contributed to it are fairly rare, but
unless the underlying issue has been identified and fixed I would not
recommend just dropping the patch. When it did hit I ended up with my
log files (kern.log, syslog & messages) filled up with 15GB of the
messages mentioned in the commit message within minutes, until my hard
drive ran out of space bringing my system down.

Even if the underlying issue is fixed I do not see any advantage in
dropping the rate limit patch - it is an absolutely trivial* patch, and
if anything I would expand it to cover all the error messages in the
driver, not just the three involved in that particular case.

* It's essentially: sed 's/dev_\(warn\|err\)\>/dev_\1_ratelimited/g'

-Ian
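
For readers unfamiliar with what the `_ratelimited` variants buy here, the following is a minimal userspace sketch of interval/burst rate limiting, not the kernel's implementation. The struct and function names (`ratelimit`, `ratelimit_allow`, `simulate_flood`) are hypothetical and exist only for illustration; time is passed in as a plain tick count so the example is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model of interval/burst rate limiting. */
struct ratelimit {
	long interval;	/* window length, in caller-defined ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* tick at which the current window opened */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
};

void ratelimit_init(struct ratelimit *rl, long interval, int burst)
{
	assert(interval > 0 && burst > 0);
	rl->interval = interval;
	rl->burst = burst;
	rl->begin = 0;
	rl->printed = 0;
	rl->missed = 0;
}

/* Returns true if a message may be emitted at tick `now`. */
bool ratelimit_allow(struct ratelimit *rl, long now)
{
	if (now - rl->begin >= rl->interval) {
		/* The window has expired: report what was dropped, start over. */
		if (rl->missed)
			fprintf(stderr, "%d messages suppressed\n", rl->missed);
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	rl->missed++;
	return false;
}

/* Fire `n` back-to-back messages, one per tick; return how many got through. */
int simulate_flood(struct ratelimit *rl, long start, int n)
{
	int emitted = 0;

	for (int i = 0; i < n; i++)
		if (ratelimit_allow(rl, start + i))
			emitted++;

	return emitted;
}
```

With a window of 5000 ticks and a burst of 10, a flood of 1000 messages inside one window lets only 10 through and counts the other 990 as suppressed, which is the property that keeps a message storm from filling the log files.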

2014-01-15 22:38:04

by Tomas Winkler

Subject: RE: linux-next: manual merge of the akpm-current tree with the char-misc tree



> -----Original Message-----
> From: Ian Munsie [mailto:[email protected]]
> Sent: Wednesday, January 15, 2014 03:58
> To: Winkler, Tomas
> Cc: Stephen Rothwell; Andrew Morton; Greg KH; Arnd Bergmann; linux-
> [email protected]; [email protected]
> Subject: RE: linux-next: manual merge of the akpm-current tree with the char-
> misc tree
>
> Excerpts from Winkler, Tomas's message of 2014-01-14 19:31:26 +1100:
> > > Today's linux-next merge of the akpm-current tree got a conflict in
> > > drivers/misc/mei/init.c between commit 33ec08263147 ("mei: revamp mei
> > > reset state machine") from the char-misc tree and commit dd045dab2999
> > > ("drivers/misc/mei: ratelimit several error messages") from the
> > > akpm-current tree.
> > >
> > > I fixed it up (see below) and can carry the fix as necessary (no action
> > > is required).
> >
> > Can we just drop this rate limit stuff? I never asked for it.
> > Tomas
>
> Hi Tomas,
>
> So far the problem has only been a one-off thing for me, so it would
> seem that whatever circumstances contributed to it are fairly rare, but
> unless the underlying issue has been identified and fixed I would not
> recommend just dropping the patch. When it did hit I ended up with my
> log files (kern.log, syslog & messages) filled up with 15GB of the
> messages mentioned in the commit message within minutes, until my hard
> drive ran out of space bringing my system down.
>
> Even if the underlying issue is fixed I do not see any advantage in
> dropping the rate limit patch - it is an absolutely trivial* patch, and
> if anything I would expand it to cover all the error messages in the
> driver, not just the three involved in that particular case.
>
> * It's essentially: sed 's/dev_\(warn\|err\)\>/dev_\1_ratelimited/g'
>

I think the issue was fixed, and the number of consecutive resets was also limited to 3, so you should
not see this issue anymore in the next release; your patch just conflicted with these fixes.
I still need to provide a simpler fix for the stable kernel.

Thanks
Tomas
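
The cap on consecutive resets mentioned above corresponds to the `reset_count` check in the merged `mei_reset()`. A minimal userspace sketch of that idea follows; it is a model under stated assumptions, not the driver code. The types and functions (`fake_dev`, `fake_reset`, `reset_storm`) are hypothetical, the limit of 3 is taken from the message above, and clearing the counter on a successful bring-up is an assumption, since the quoted diff does not show where the driver resets it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONSEC_RESET 3	/* limit quoted in the thread */

enum dev_state { DEV_ENABLED, DEV_RESETTING, DEV_DISABLED };

/* Hypothetical stand-in for the device structure. */
struct fake_dev {
	enum dev_state state;
	unsigned int reset_count;
};

void fake_dev_init(struct fake_dev *dev)
{
	assert(dev != NULL);
	dev->state = DEV_ENABLED;
	dev->reset_count = 0;
}

/*
 * One reset attempt. `link_ok` says whether the simulated hardware
 * comes back up. Returns 0, or -ENODEV once the cap is exceeded.
 */
int fake_reset(struct fake_dev *dev, bool link_ok)
{
	dev->state = DEV_RESETTING;

	dev->reset_count++;
	if (dev->reset_count > MAX_CONSEC_RESET) {
		/* Too many resets in a row: give up instead of looping. */
		dev->state = DEV_DISABLED;
		return -ENODEV;
	}

	if (link_ok) {
		/* Assumption: a successful bring-up clears the counter. */
		dev->reset_count = 0;
		dev->state = DEV_ENABLED;
	}

	return 0;
}

/* Keep resetting a device whose link never recovers; return attempts made. */
int reset_storm(struct fake_dev *dev, int max_attempts)
{
	int attempts = 0;

	while (attempts < max_attempts) {
		attempts++;
		if (fake_reset(dev, false) < 0)
			break;
	}

	return attempts;
}
```

In this model a device that never recovers is disabled on the fourth attempt, so the number of error messages a reset loop can produce is bounded even without rate limiting.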

2014-01-16 21:03:15

by Ian Munsie

Subject: RE: linux-next: manual merge of the akpm-current tree with the char-misc tree

Excerpts from Winkler, Tomas's message of 2014-01-16 09:36:40 +1100:
> I think the issue was fixed, and the number of consecutive resets was also limited to 3, so you should
> not see this issue anymore in the next release; your patch just conflicted with these fixes.
> I still need to provide a simpler fix for the stable kernel.

OK, if you think it should not occur again, I don't mind if my patch is
dropped.

Cheers,
-Ian