2007-10-08 18:08:17

by Greg KH

Subject: [patch 00/12] 2.6.22.10 -stable review

This is the start of the stable review cycle for the 2.6.22.10 release.
There are 12 patches in this series, all will be posted as a response to
this one. If anyone has any issues with these being applied, please let
us know. If anyone is a maintainer of the proper subsystem, and wants
to add a Signed-off-by: line to the patch, please respond with it.

These patches are sent out with a number of different people on the Cc:
line. If you wish to be a reviewer, please email [email protected] to
add your name to the list. If you want to be off the reviewer list,
also email us.

Responses should be made by Wed October 10, 18:00:00 UTC. Anything
received after that time might be too late.

thanks,

greg k-h


2007-10-08 18:08:29

by Greg KH

Subject: [patch 01/12] scsi_transport_spi: fix domain validation failure from incorrect width setting

From: James Bottomley <[email protected]>

commit 2302827c95fe0f441025acd5133e532d2eef322b from upstream

Domain Validation in the SPI transport class is failing on boxes with
damaged cables (and failing to the extent that the box hangs). The
problem is that the first test it does is a cable integrity test for
wide transfers and if this fails, it turns the wide bit off. The
problem is that the next set of tests it does turns wide back on
again, with the result that it runs through the entirety of DV with a
known bad setting and then hangs the system.

The attached patch fixes the problem by physically nailing the wide
setting to what it deduces it should be for the whole of Domain
Validation.

Signed-off-by: James Bottomley <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/scsi_transport_spi.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)

--- a/drivers/scsi/scsi_transport_spi.c
+++ b/drivers/scsi/scsi_transport_spi.c
@@ -787,10 +787,12 @@ spi_dv_device_internal(struct scsi_devic
struct scsi_target *starget = sdev->sdev_target;
struct Scsi_Host *shost = sdev->host;
int len = sdev->inquiry_len;
+ int min_period = spi_min_period(starget);
+ int max_width = spi_max_width(starget);
/* first set us up for narrow async */
DV_SET(offset, 0);
DV_SET(width, 0);
-
+
if (spi_dv_device_compare_inquiry(sdev, buffer, buffer, DV_LOOPS)
!= SPI_COMPARE_SUCCESS) {
starget_printk(KERN_ERR, starget, "Domain Validation Initial Inquiry Failed\n");
@@ -798,9 +800,13 @@ spi_dv_device_internal(struct scsi_devic
return;
}

+ if (!scsi_device_wide(sdev)) {
+ spi_max_width(starget) = 0;
+ max_width = 0;
+ }
+
/* test width */
- if (i->f->set_width && spi_max_width(starget) &&
- scsi_device_wide(sdev)) {
+ if (i->f->set_width && max_width) {
i->f->set_width(starget, 1);

if (spi_dv_device_compare_inquiry(sdev, buffer,
@@ -809,6 +815,11 @@ spi_dv_device_internal(struct scsi_devic
!= SPI_COMPARE_SUCCESS) {
starget_printk(KERN_ERR, starget, "Wide Transfers Fail\n");
i->f->set_width(starget, 0);
+ /* Make sure we don't force wide back on by asking
+ * for a transfer period that requires it */
+ max_width = 0;
+ if (min_period < 10)
+ min_period = 10;
}
}

@@ -828,7 +839,8 @@ spi_dv_device_internal(struct scsi_devic

/* now set up to the maximum */
DV_SET(offset, spi_max_offset(starget));
- DV_SET(period, spi_min_period(starget));
+ DV_SET(period, min_period);
+
/* try QAS requests; this should be harmless to set if the
* target supports it */
if (scsi_device_qas(sdev)) {
@@ -837,14 +849,14 @@ spi_dv_device_internal(struct scsi_devic
DV_SET(qas, 0);
}

- if (scsi_device_ius(sdev) && spi_min_period(starget) < 9) {
+ if (scsi_device_ius(sdev) && min_period < 9) {
/* This u320 (or u640). Set IU transfers */
DV_SET(iu, 1);
/* Then set the optional parameters */
DV_SET(rd_strm, 1);
DV_SET(wr_flow, 1);
DV_SET(rti, 1);
- if (spi_min_period(starget) == 8)
+ if (min_period == 8)
DV_SET(pcomp_en, 1);
} else {
DV_SET(iu, 0);
@@ -862,6 +874,10 @@ spi_dv_device_internal(struct scsi_devic
} else {
DV_SET(dt, 1);
}
+ /* set width last because it will pull all the other
+ * parameters down to required values */
+ DV_SET(width, max_width);
+
/* Do the read only INQUIRY tests */
spi_dv_retrain(sdev, buffer, buffer + sdev->inquiry_len,
spi_dv_device_compare_inquiry);

--
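The bookkeeping this patch adds can be sketched outside the kernel. The following is a minimal userspace model, not the transport class code: `struct dv_limits` and `dv_clamp()` are hypothetical names, and the `set_width` callback check is left out. It only shows how a failed wide test pins the width to narrow and raises the period floor to 10 for the rest of Domain Validation.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the limits DV works with. */
struct dv_limits {
    int max_width;   /* 1 = wide allowed, 0 = narrow only */
    int min_period;  /* SPI period factor; smaller means faster */
};

/*
 * Mirror of the patch's local variables: once the wide cable test
 * fails, pin the width to narrow and keep the period at 10 or above
 * so later steps cannot request a rate that needs wide transfers.
 */
static struct dv_limits dv_clamp(struct dv_limits lim, bool device_wide,
                                 bool wide_test_passed)
{
    if (!device_wide)
        lim.max_width = 0;

    if (lim.max_width && !wide_test_passed) {
        lim.max_width = 0;
        if (lim.min_period < 10)
            lim.min_period = 10;
    }
    return lim;
}
```

Keeping the two limits in locals, as the patch does, means every later `DV_SET()` reads the clamped values rather than re-reading the target's advertised maximum.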

2007-10-08 18:08:42

by Greg KH

Subject: [patch 02/12] sky2: reduce impact of watchdog timer

From: Stephen Hemminger <[email protected]>

This is the 2.6.22 version of a regression fix that is already
in 2.6.23. Change the watchdog timer from running 10 times per second
all the time to once per second, and only while an interface is up.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
drivers/net/sky2.c | 45 ++++++++++++++++++---------------------------
drivers/net/sky2.h | 2 +-
2 files changed, 19 insertions(+), 28 deletions(-)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -96,10 +96,6 @@ static int disable_msi = 0;
module_param(disable_msi, int, 0);
MODULE_PARM_DESC(disable_msi, "Disable Message Signaled Interrupt (MSI)");

-static int idle_timeout = 100;
-module_param(idle_timeout, int, 0);
-MODULE_PARM_DESC(idle_timeout, "Watchdog timer for lost interrupts (ms)");
-
static const struct pci_device_id sky2_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9000) }, /* SK-9Sxx */
{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9E00) }, /* SK-9Exx */
@@ -1693,6 +1689,8 @@ static void sky2_link_up(struct sky2_por

netif_carrier_on(sky2->netdev);

+ mod_timer(&hw->watchdog_timer, jiffies + 1);
+
/* Turn on link LED */
sky2_write8(hw, SK_REG(port, LNK_LED_REG),
LINKLED_ON | LINKLED_BLINK_OFF | LINKLED_LINKSYNC_OFF);
@@ -2384,25 +2382,25 @@ static void sky2_le_error(struct sky2_hw
sky2_write32(hw, Q_ADDR(q, Q_CSR), BMU_CLR_IRQ_CHK);
}

-/* If idle then force a fake soft NAPI poll once a second
- * to work around cases where sharing an edge triggered interrupt.
- */
-static inline void sky2_idle_start(struct sky2_hw *hw)
-{
- if (idle_timeout > 0)
- mod_timer(&hw->idle_timer,
- jiffies + msecs_to_jiffies(idle_timeout));
-}
-
-static void sky2_idle(unsigned long arg)
+/* Force a fake soft NAPI poll to handle lost IRQ's */
+static void sky2_watchdog(unsigned long arg)
{
struct sky2_hw *hw = (struct sky2_hw *) arg;
struct net_device *dev = hw->dev[0];
+ int i, active = 0;

if (__netif_rx_schedule_prep(dev))
__netif_rx_schedule(dev);

- mod_timer(&hw->idle_timer, jiffies + msecs_to_jiffies(idle_timeout));
+ for (i = 0; i < hw->ports; i++) {
+ dev = hw->dev[i];
+ if (!netif_running(dev))
+ continue;
+ ++active;
+ }
+
+ if (active)
+ mod_timer(&hw->watchdog_timer, round_jiffies(jiffies + HZ));
}

/* Hardware/software error handling */
@@ -2692,8 +2690,6 @@ static void sky2_restart(struct work_str

dev_dbg(&hw->pdev->dev, "restarting\n");

- del_timer_sync(&hw->idle_timer);
-
rtnl_lock();
sky2_write32(hw, B0_IMSK, 0);
sky2_read32(hw, B0_IMSK);
@@ -2722,8 +2718,6 @@ static void sky2_restart(struct work_str
}
}

- sky2_idle_start(hw);
-
rtnl_unlock();
}

@@ -3713,11 +3707,9 @@ static int __devinit sky2_probe(struct p
sky2_show_addr(dev1);
}

- setup_timer(&hw->idle_timer, sky2_idle, (unsigned long) hw);
+ setup_timer(&hw->watchdog_timer, sky2_watchdog, (unsigned long) hw);
INIT_WORK(&hw->restart_work, sky2_restart);

- sky2_idle_start(hw);
-
pci_set_drvdata(pdev, hw);

return 0;
@@ -3752,7 +3744,7 @@ static void __devexit sky2_remove(struct
if (!hw)
return;

- del_timer_sync(&hw->idle_timer);
+ del_timer_sync(&hw->watchdog_timer);

flush_scheduled_work();

@@ -3796,7 +3788,7 @@ static int sky2_suspend(struct pci_dev *
if (!hw)
return 0;

- del_timer_sync(&hw->idle_timer);
+ del_timer_sync(&hw->watchdog_timer);
netif_poll_disable(hw->dev[0]);

for (i = 0; i < hw->ports; i++) {
@@ -3862,7 +3854,7 @@ static int sky2_resume(struct pci_dev *p
}

netif_poll_enable(hw->dev[0]);
- sky2_idle_start(hw);
+
return 0;
out:
dev_err(&pdev->dev, "resume failed (%d)\n", err);
@@ -3879,7 +3871,6 @@ static void sky2_shutdown(struct pci_dev
if (!hw)
return;

- del_timer_sync(&hw->idle_timer);
netif_poll_disable(hw->dev[0]);

for (i = 0; i < hw->ports; i++) {
--- a/drivers/net/sky2.h
+++ b/drivers/net/sky2.h
@@ -1921,7 +1921,7 @@ struct sky2_hw {
u32 st_idx;
dma_addr_t st_dma;

- struct timer_list idle_timer;
+ struct timer_list watchdog_timer;
struct work_struct restart_work;
int msi;
wait_queue_head_t msi_wait;

--
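The re-arm decision in `sky2_watchdog()` can be modelled in a few lines. This is a userspace sketch only: `struct fake_hw`, `MAX_PORTS`, and `watchdog_should_rearm()` are hypothetical stand-ins for `struct sky2_hw`, its port count, and the `netif_running()` loop in the patch.

```c
#include <stdbool.h>

#define MAX_PORTS 2  /* assumption for this sketch */

/* Hypothetical stand-in for the parts of struct sky2_hw used here. */
struct fake_hw {
    int ports;
    bool running[MAX_PORTS];  /* models netif_running() per port */
};

/* Returns true when the watchdog should be re-armed for another second. */
static bool watchdog_should_rearm(const struct fake_hw *hw)
{
    int i, active = 0;

    for (i = 0; i < hw->ports; i++) {
        if (!hw->running[i])
            continue;
        ++active;
    }
    return active > 0;
}
```

With no running ports the timer simply is not re-armed; in the patch it is started again from `sky2_link_up()`.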

2007-10-08 18:11:19

by Greg KH

Subject: [patch 03/12] sky2: fix VLAN receive processing

From: Stephen Hemminger <[email protected]>

Already upstream.

The length check for truncated frames was not correctly handling
the case where VLAN acceleration had already read the tag.
Also, the Yukon EX has some features that use high bit of status
as security tag.

Signed-off-by: Pierre-Yves Ritschard <[email protected]>
Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/sky2.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -2049,6 +2049,7 @@ static struct sk_buff *sky2_receive(stru
struct sky2_port *sky2 = netdev_priv(dev);
struct rx_ring_info *re = sky2->rx_ring + sky2->rx_next;
struct sk_buff *skb = NULL;
+ u16 count;

if (unlikely(netif_msg_rx_status(sky2)))
printk(KERN_DEBUG PFX "%s: rx slot %u status 0x%x len %d\n",
@@ -2063,7 +2064,13 @@ static struct sk_buff *sky2_receive(stru
if (!(status & GMR_FS_RX_OK))
goto resubmit;

- if (status >> 16 != length)
+ count = (status & GMR_FS_LEN) >> 16;
+#ifdef SKY2_VLAN_TAG_USED
+ /* Account for vlan tag */
+ if (sky2->vlgrp && (status & GMR_FS_VLAN))
+ count -= VLAN_HLEN;
+#endif
+ if (count != length)
goto len_mismatch;

if (length < copybreak)

--
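The corrected length check can be tried in isolation. The sketch below is a userspace approximation: the `SKETCH_*` constants are assumed values chosen for illustration (the real definitions live in sky2.h and if_vlan.h), and `rx_length_ok()` is a hypothetical helper, not a driver function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for this sketch; check the driver headers for the real ones. */
#define SKETCH_FS_LEN    (0x7fffu << 16) /* length field, top bit excluded */
#define SKETCH_FS_VLAN   (1u << 13)      /* frame carried a VLAN tag */
#define SKETCH_VLAN_HLEN 4               /* bytes in an 802.1Q tag */

/*
 * Compare the length reported in the status word with the length seen
 * by the driver, allowing for a tag that VLAN acceleration already
 * stripped from the frame.
 */
static bool rx_length_ok(uint32_t status, uint16_t length, bool vlan_accel)
{
    uint16_t count = (status & SKETCH_FS_LEN) >> 16;

    if (vlan_accel && (status & SKETCH_FS_VLAN))
        count -= SKETCH_VLAN_HLEN;
    return count == length;
}
```

Masking before the shift is what keeps the top status bit, which the commit message says the Yukon EX uses as a security tag, out of the comparison.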

2007-10-08 18:14:25

by Greg KH

Subject: [patch 04/12] sky2: fix transmit state on resume

From: Stephen Hemminger <[email protected]>

Already upstream.

After resume, driver has reset the chip so the current state
of transmit checksum offload state machine and DMA state machine
will be undefined.

The fix is to set the state so that first Tx will set MSS and offset
values.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/sky2.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -831,6 +831,20 @@ static inline struct sky2_tx_le *get_tx_
return le;
}

+static void tx_init(struct sky2_port *sky2)
+{
+ struct sky2_tx_le *le;
+
+ sky2->tx_prod = sky2->tx_cons = 0;
+ sky2->tx_tcpsum = 0;
+ sky2->tx_last_mss = 0;
+
+ le = get_tx_le(sky2);
+ le->addr = 0;
+ le->opcode = OP_ADDR64 | HW_OWNER;
+ sky2->tx_addr64 = 0;
+}
+
static inline struct tx_ring_info *tx_le_re(struct sky2_port *sky2,
struct sky2_tx_le *le)
{
@@ -1244,7 +1258,8 @@ static int sky2_up(struct net_device *de
GFP_KERNEL);
if (!sky2->tx_ring)
goto err_out;
- sky2->tx_prod = sky2->tx_cons = 0;
+
+ tx_init(sky2);

sky2->rx_le = pci_alloc_consistent(hw->pdev, RX_LE_BYTES,
&sky2->rx_le_map);

--

2007-10-08 18:18:03

by Greg KH

Subject: [patch 05/12] libata: update drive blacklists

From: Chuck Ebbert <[email protected]>

Update the libata drive blacklists to the latest in 2.6.23-rc8.

Signed-off-by: Chuck Ebbert <[email protected]>
Cc: Jeff Garzik <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/ata/libata-core.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3774,6 +3774,8 @@ static const struct ata_blacklist_entry
{ "SAMSUNG CD-ROM SN-124","N001", ATA_HORKAGE_NODMA },
{ "Seagate STT20000A", NULL, ATA_HORKAGE_NODMA },
{ "IOMEGA ZIP 250 ATAPI", NULL, ATA_HORKAGE_NODMA }, /* temporary fix */
+ { "IOMEGA ZIP 250 ATAPI Floppy",
+ NULL, ATA_HORKAGE_NODMA },

/* Weird ATAPI devices */
{ "TORiSAN DVD-ROM DRD-N216", NULL, ATA_HORKAGE_MAX_SEC_128 },
@@ -3787,7 +3789,13 @@ static const struct ata_blacklist_entry
{ "FUJITSU MHT2060BH", NULL, ATA_HORKAGE_NONCQ },
/* NCQ is broken */
{ "Maxtor 6L250S0", "BANC1G10", ATA_HORKAGE_NONCQ },
+ { "Maxtor 6B200M0", "BANC1BM0", ATA_HORKAGE_NONCQ },
{ "Maxtor 6B200M0", "BANC1B10", ATA_HORKAGE_NONCQ },
+ { "Maxtor 7B250S0", "BANC1B70", ATA_HORKAGE_NONCQ, },
+ { "Maxtor 7B300S0", "BANC1B70", ATA_HORKAGE_NONCQ },
+ { "Maxtor 7V300F0", "VA111630", ATA_HORKAGE_NONCQ },
+ { "HITACHI HDS7250SASUN500G 0621KTAWSD", "K2AOAJ0AHITACHI",
+ ATA_HORKAGE_NONCQ },
/* NCQ hard hangs device under heavier load, needs hard power cycle */
{ "Maxtor 6B250S0", "BANC1B70", ATA_HORKAGE_NONCQ },
/* Blacklist entries taken from Silicon Image 3124/3132
@@ -3801,8 +3809,9 @@ static const struct ata_blacklist_entry
{ "Hitachi HTS541616J9SA00", "SB4OC70P", ATA_HORKAGE_NONCQ, },
{ "WDC WD740ADFD-00NLR1", NULL, ATA_HORKAGE_NONCQ, },
{ "FUJITSU MHV2080BH", "00840028", ATA_HORKAGE_NONCQ, },
-
- /* Devices with NCQ limits */
+ { "ST9160821AS", "3.CLF", ATA_HORKAGE_NONCQ, },
+ { "ST3160812AS", "3.AD", ATA_HORKAGE_NONCQ, },
+ { "SAMSUNG HD401LJ", "ZZ100-15", ATA_HORKAGE_NONCQ, },

/* End Marker */
{ }

--

2007-10-08 18:21:53

by Greg KH

Subject: [patch 06/12] Fix ppp_mppe kernel stack usage.

From: Michal Schmidt <[email protected]>

commit 45dfd5b5dd20f17fe23dafc5cfe921474d27f849 from upstream

Signed-off-by: Michal Schmidt <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ppp_mppe.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)

--- a/drivers/net/ppp_mppe.c
+++ b/drivers/net/ppp_mppe.c
@@ -136,7 +136,7 @@ struct ppp_mppe_state {
* Key Derivation, from RFC 3078, RFC 3079.
* Equivalent to Get_Key() for MS-CHAP as described in RFC 3079.
*/
-static void get_new_key_from_sha(struct ppp_mppe_state * state, unsigned char *InterimKey)
+static void get_new_key_from_sha(struct ppp_mppe_state * state)
{
struct hash_desc desc;
struct scatterlist sg[4];
@@ -153,8 +153,6 @@ static void get_new_key_from_sha(struct
desc.flags = 0;

crypto_hash_digest(&desc, sg, nbytes, state->sha1_digest);
-
- memcpy(InterimKey, state->sha1_digest, state->keylen);
}

/*
@@ -163,21 +161,21 @@ static void get_new_key_from_sha(struct
*/
static void mppe_rekey(struct ppp_mppe_state * state, int initial_key)
{
- unsigned char InterimKey[MPPE_MAX_KEY_LEN];
struct scatterlist sg_in[1], sg_out[1];
struct blkcipher_desc desc = { .tfm = state->arc4 };

- get_new_key_from_sha(state, InterimKey);
+ get_new_key_from_sha(state);
if (!initial_key) {
- crypto_blkcipher_setkey(state->arc4, InterimKey, state->keylen);
- setup_sg(sg_in, InterimKey, state->keylen);
+ crypto_blkcipher_setkey(state->arc4, state->sha1_digest,
+ state->keylen);
+ setup_sg(sg_in, state->sha1_digest, state->keylen);
setup_sg(sg_out, state->session_key, state->keylen);
if (crypto_blkcipher_encrypt(&desc, sg_out, sg_in,
state->keylen) != 0) {
printk(KERN_WARNING "mppe_rekey: cipher_encrypt failed\n");
}
} else {
- memcpy(state->session_key, InterimKey, state->keylen);
+ memcpy(state->session_key, state->sha1_digest, state->keylen);
}
if (state->keylen == 8) {
/* See RFC 3078 */

--

2007-10-08 18:25:14

by Greg KH

Subject: [patch 07/12] i2c-algo-bit: Read block data bugfix


From: David Brownell <[email protected]>

In Linus tree already:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=939bc4943d0483961edc45b63a7d27b4ffe547e3

This fixes a bug in the way i2c-algo-bit handles I2C_M_RECV_LEN,
used to implement i2c_smbus_read_block_data(). Previously, in the
absence of PEC (rarely used!) it would NAK the "length" byte:

S addr Rd [A] [length] NA

That prevents the subsequent data bytes from being read:

S addr Rd [A] [length] { A [data] }* NA

The primary fix just reorders two code blocks, so the length used
in the "should I NAK now?" check incorporates the data which it
just read from the slave device.

However, that move also highlighted other fault handling glitches.
This fixes those by abstracting the RX path ack/nak logic, so it
can be used in more than one location.

Signed-off-by: David Brownell <[email protected]>
Signed-off-by: Jean Delvare <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/i2c/algos/i2c-algo-bit.c | 52 ++++++++++++++++++++++++---------------
1 file changed, 32 insertions(+), 20 deletions(-)

--- a/drivers/i2c/algos/i2c-algo-bit.c
+++ b/drivers/i2c/algos/i2c-algo-bit.c
@@ -357,13 +357,29 @@ static int sendbytes(struct i2c_adapter
return wrcount;
}

+static int acknak(struct i2c_adapter *i2c_adap, int is_ack)
+{
+ struct i2c_algo_bit_data *adap = i2c_adap->algo_data;
+
+ /* assert: sda is high */
+ if (is_ack) /* send ack */
+ setsda(adap, 0);
+ udelay((adap->udelay + 1) / 2);
+ if (sclhi(adap) < 0) { /* timeout */
+ dev_err(&i2c_adap->dev, "readbytes: ack/nak timeout\n");
+ return -ETIMEDOUT;
+ }
+ scllo(adap);
+ return 0;
+}
+
static int readbytes(struct i2c_adapter *i2c_adap, struct i2c_msg *msg)
{
int inval;
int rdcount=0; /* counts bytes read */
- struct i2c_algo_bit_data *adap = i2c_adap->algo_data;
unsigned char *temp = msg->buf;
int count = msg->len;
+ const unsigned flags = msg->flags;

while (count > 0) {
inval = i2c_inb(i2c_adap);
@@ -377,28 +393,12 @@ static int readbytes(struct i2c_adapter
temp++;
count--;

- if (msg->flags & I2C_M_NO_RD_ACK) {
- bit_dbg(2, &i2c_adap->dev, "i2c_inb: 0x%02x\n",
- inval);
- continue;
- }
-
- /* assert: sda is high */
- if (count) /* send ack */
- setsda(adap, 0);
- udelay((adap->udelay + 1) / 2);
- bit_dbg(2, &i2c_adap->dev, "i2c_inb: 0x%02x %s\n", inval,
- count ? "A" : "NA");
- if (sclhi(adap)<0) { /* timeout */
- dev_err(&i2c_adap->dev, "readbytes: timeout at ack\n");
- return -ETIMEDOUT;
- };
- scllo(adap);
-
/* Some SMBus transactions require that we receive the
transaction length as the first read byte. */
- if (rdcount == 1 && (msg->flags & I2C_M_RECV_LEN)) {
+ if (rdcount == 1 && (flags & I2C_M_RECV_LEN)) {
if (inval <= 0 || inval > I2C_SMBUS_BLOCK_MAX) {
+ if (!(flags & I2C_M_NO_RD_ACK))
+ acknak(i2c_adap, 0);
dev_err(&i2c_adap->dev, "readbytes: invalid "
"block length (%d)\n", inval);
return -EREMOTEIO;
@@ -409,6 +409,18 @@ static int readbytes(struct i2c_adapter
count += inval;
msg->len += inval;
}
+
+ bit_dbg(2, &i2c_adap->dev, "readbytes: 0x%02x %s\n",
+ inval,
+ (flags & I2C_M_NO_RD_ACK)
+ ? "(no ack/nak)"
+ : (count ? "A" : "NA"));
+
+ if (!(flags & I2C_M_NO_RD_ACK)) {
+ inval = acknak(i2c_adap, count);
+ if (inval < 0)
+ return inval;
+ }
}
return rdcount;
}

--
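The ordering problem described in this patch can be reduced to a toy model. The sketch below assumes the message starts out expecting a single byte (the length byte, no PEC); `acks_length_byte()` and its parameters are hypothetical and only mimic the `count` arithmetic in `readbytes()`.

```c
#include <stdbool.h>

/*
 * Decide whether the master ACKs the length byte of an I2C_M_RECV_LEN
 * read. 'remaining' starts at 1 because only the length byte is
 * expected until its value is known (assumption: no PEC byte).
 */
static bool acks_length_byte(int block_len, bool extend_before_ack)
{
    int remaining = 1;
    bool ack;

    remaining--;                 /* length byte has been read */
    if (extend_before_ack)
        remaining += block_len;  /* patched order: extend first */

    ack = remaining > 0;         /* ACK only if more bytes are expected */

    if (!extend_before_ack)
        remaining += block_len;  /* old order: too late for the ACK */
    (void)remaining;
    return ack;
}
```

In the old order the decision sees zero remaining bytes and sends NA after the length byte; in the patched order it sees the block length and sends A.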

2007-10-08 18:32:06

by Greg KH

Subject: [patch 08/12] NLM: Fix a circular lock dependency in lockd

From: Trond Myklebust <[email protected]>

commit 255129d1e9ca0ed3d69d5517fae3e03d7ab4b806 in upstream.

The problem is that the garbage collector for the 'host' structures,
nlm_gc_hosts(), holds nlm_host_mutex while calling down to
nlmsvc_mark_resources, which eventually takes the file->f_mutex.

We cannot therefore call nlmsvc_lookup_host() from within
nlmsvc_create_block, since the caller will already hold file->f_mutex, so
the attempt to grab nlm_host_mutex may deadlock.

Fix the problem by calling nlmsvc_lookup_host() outside the file->f_mutex.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/lockd/svclock.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)

--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -171,19 +171,14 @@ found:
* GRANTED_RES message by cookie, without having to rely on the client's IP
* address. --okir
*/
-static inline struct nlm_block *
-nlmsvc_create_block(struct svc_rqst *rqstp, struct nlm_file *file,
- struct nlm_lock *lock, struct nlm_cookie *cookie)
+static struct nlm_block *
+nlmsvc_create_block(struct svc_rqst *rqstp, struct nlm_host *host,
+ struct nlm_file *file, struct nlm_lock *lock,
+ struct nlm_cookie *cookie)
{
struct nlm_block *block;
- struct nlm_host *host;
struct nlm_rqst *call = NULL;

- /* Create host handle for callback */
- host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len);
- if (host == NULL)
- return NULL;
-
call = nlm_alloc_call(host);
if (call == NULL)
return NULL;
@@ -366,6 +361,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, stru
struct nlm_lock *lock, int wait, struct nlm_cookie *cookie)
{
struct nlm_block *block = NULL;
+ struct nlm_host *host;
int error;
__be32 ret;

@@ -377,6 +373,10 @@ nlmsvc_lock(struct svc_rqst *rqstp, stru
(long long)lock->fl.fl_end,
wait);

+ /* Create host handle for callback */
+ host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len);
+ if (host == NULL)
+ return nlm_lck_denied_nolocks;

/* Lock file against concurrent access */
mutex_lock(&file->f_mutex);
@@ -385,7 +385,8 @@ nlmsvc_lock(struct svc_rqst *rqstp, stru
*/
block = nlmsvc_lookup_block(file, lock);
if (block == NULL) {
- block = nlmsvc_create_block(rqstp, file, lock, cookie);
+ block = nlmsvc_create_block(rqstp, nlm_get_host(host), file,
+ lock, cookie);
ret = nlm_lck_denied_nolocks;
if (block == NULL)
goto out;
@@ -449,6 +450,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, stru
out:
mutex_unlock(&file->f_mutex);
nlmsvc_release_block(block);
+ nlm_release_host(host);
dprintk("lockd: nlmsvc_lock returned %u\n", ret);
return ret;
}
@@ -477,10 +479,15 @@ nlmsvc_testlock(struct svc_rqst *rqstp,

if (block == NULL) {
struct file_lock *conf = kzalloc(sizeof(*conf), GFP_KERNEL);
+ struct nlm_host *host;

if (conf == NULL)
return nlm_granted;
- block = nlmsvc_create_block(rqstp, file, lock, cookie);
+ /* Create host handle for callback */
+ host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len);
+ if (host == NULL)
+ return nlm_lck_denied_nolocks;
+ block = nlmsvc_create_block(rqstp, host, file, lock, cookie);
if (block == NULL) {
kfree(conf);
return nlm_granted;

--

2007-10-08 18:35:25

by Greg KH

Subject: [patch 09/12] Fix SMP poweroff hangs

From: Mark Lord <[email protected]>

commit 4047727e5ae33f9b8d2b7766d1994ea6e5ec2991 from upstream

We need to disable all CPUs other than the boot CPU (usually 0) before
attempting to power-off modern SMP machines. This fixes the
hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
new toybox.

Signed-off-by: Mark Lord <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/sys.c | 2 ++
1 file changed, 2 insertions(+)

--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -31,6 +31,7 @@
#include <linux/cn_proc.h>
#include <linux/getcpu.h>
#include <linux/task_io_accounting_ops.h>
+#include <linux/cpu.h>

#include <linux/compat.h>
#include <linux/syscalls.h>
@@ -865,6 +866,7 @@ EXPORT_SYMBOL_GPL(kernel_halt);
void kernel_power_off(void)
{
kernel_shutdown_prepare(SYSTEM_POWER_OFF);
+ disable_nonboot_cpus();
printk(KERN_EMERG "Power down.\n");
machine_power_off();
}

--

2007-10-08 18:38:22

by Greg KH

Subject: [patch 10/12] Fix timer_stats printout of events/sec

From: Anton Blanchard <[email protected]>

commit 74922be1485818ed368c4cf4f0b100f70bf01e08 upstream.

When using /proc/timer_stats on ppc64 I noticed the events/sec field wasn't
accurate. Sometimes the integer part was incorrect due to rounding (we
weren't taking the fractional seconds into consideration).

The fraction part is also wrong; we need to pad the printf statement and
take the bottom three digits of 1000 times the value.

Signed-off-by: Anton Blanchard <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
---

kernel/time/timer_stats.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -319,8 +319,9 @@ static int tstats_show(struct seq_file *
ms = 1;

if (events && period.tv_sec)
- seq_printf(m, "%ld total events, %ld.%ld events/sec\n", events,
- events / period.tv_sec, events * 1000 / ms);
+ seq_printf(m, "%ld total events, %ld.%03ld events/sec\n",
+ events, events * 1000 / ms,
+ (events * 1000000 / ms) % 1000);
else
seq_printf(m, "%ld total events\n", events);


--
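The arithmetic in the new `seq_printf()` call can be checked in userspace. In this sketch `format_rate()` is a hypothetical helper that takes the event count and the sampling period in milliseconds and applies the same two expressions as the patch.

```c
#include <stdio.h>

/* Format "N.FFF" events per second from an event count and a period in ms. */
static void format_rate(char *buf, size_t len, long events, long ms)
{
    /* Whole events per second, using the full period including fractions. */
    long whole = events * 1000 / ms;
    /* Bottom three digits of 1000 times the rate. */
    long frac = (events * 1000000 / ms) % 1000;

    /* %03ld pads the fraction so that 62 prints as "062", not "62". */
    snprintf(buf, len, "%ld.%03ld", whole, frac);
}
```

For 5 events in 1500 ms this gives "3.333", where the old expressions would have printed "5.3" (5 / 1 second, then the unpadded 5000 / 1500). As in the patch, the multiplication by 1000000 is done in `long`, so very large event counts could overflow where `long` is 32 bits.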

2007-10-08 18:41:43

by Greg KH

Subject: [patch 11/12] SELinux: clear parent death signal on SID transitions

From: Stephen Smalley <[email protected]>

commit 4ac212ad4e8fafc22fa147fc255ff5fa5435cf33 upstream.

Clear parent death signal on SID transitions to prevent unauthorized
signaling between SIDs.

Signed-off-by: Stephen Smalley <[email protected]>
Acked-by: Eric Paris <[email protected]>
Signed-off-by: James Morris <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/selinux/hooks.c | 3 +++
1 file changed, 3 insertions(+)

--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1906,6 +1906,9 @@ static void selinux_bprm_post_apply_cred
spin_unlock_irq(&current->sighand->siglock);
}

+ /* Always clear parent death signal on SID transitions. */
+ current->pdeath_signal = 0;
+
/* Check whether the new SID can inherit resource limits
from the old SID. If not, reset all soft limits to
the lower of the current task's hard limit and the init

--

2007-10-08 18:44:46

by Greg KH

Subject: [patch 12/12] i386: Use global flag to disable broken local apic timer on AMD CPUs.

From: Andi Kleen <[email protected]>

commit d3f7eae182b04997be19343a23f7009170f4f7a5 upstream

The BIOS on the Averatec 2370 and some other Turion laptops seems to
program the ENABLE_C1E MSR inconsistently between cores. This confuses
the lapic use heuristics because when C1E is enabled anywhere it seems
to affect the complete chip.

Use a global flag instead of a per cpu flag to handle this.
If any CPU has C1E enabled, disable lapic use.

Thanks to Cal Peake for debugging.

Cc: [email protected]
Cc: Chuck Ebbert <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/i386/kernel/apic.c | 10 ++++------
arch/i386/kernel/cpu/amd.c | 7 ++++++-
include/asm-i386/apic.h | 2 ++
include/asm-i386/cpufeature.h | 2 +-
4 files changed, 13 insertions(+), 8 deletions(-)

--- a/arch/i386/kernel/apic.c
+++ b/arch/i386/kernel/apic.c
@@ -61,8 +61,9 @@ static int enable_local_apic __initdata

/* Local APIC timer verification ok */
static int local_apic_timer_verify_ok;
-/* Disable local APIC timer from the kernel commandline or via dmi quirk */
-static int local_apic_timer_disabled;
+/* Disable local APIC timer from the kernel commandline or via dmi quirk
+ or using CPU MSR check */
+int local_apic_timer_disabled;
/* Local APIC timer works in C2 */
int local_apic_timer_c2_ok;
EXPORT_SYMBOL_GPL(local_apic_timer_c2_ok);
@@ -367,12 +368,9 @@ void __init setup_boot_APIC_clock(void)
long delta, deltapm;
int pm_referenced = 0;

- if (boot_cpu_has(X86_FEATURE_LAPIC_TIMER_BROKEN))
- local_apic_timer_disabled = 1;
-
/*
* The local apic timer can be disabled via the kernel
- * commandline or from the test above. Register the lapic
+ * commandline or from the CPU detection code. Register the lapic
* timer as a dummy clock event source on SMP systems, so the
* broadcast mechanism is used. On UP systems simply ignore it.
*/
--- a/arch/i386/kernel/cpu/amd.c
+++ b/arch/i386/kernel/cpu/amd.c
@@ -3,6 +3,7 @@
#include <linux/mm.h>
#include <asm/io.h>
#include <asm/processor.h>
+#include <asm/apic.h>

#include "cpu.h"

@@ -22,6 +23,7 @@
extern void vide(void);
__asm__(".align 4\nvide: ret");

+#ifdef CONFIG_X86_LOCAL_APIC
#define ENABLE_C1E_MASK 0x18000000
#define CPUID_PROCESSOR_SIGNATURE 1
#define CPUID_XFAM 0x0ff00000
@@ -52,6 +54,7 @@ static __cpuinit int amd_apic_timer_brok
}
return 0;
}
+#endif

int force_mwait __cpuinitdata;

@@ -275,8 +278,10 @@ static void __cpuinit init_amd(struct cp
if (cpuid_eax(0x80000000) >= 0x80000006)
num_cache_leaves = 3;

+#ifdef CONFIG_X86_LOCAL_APIC
if (amd_apic_timer_broken())
- set_bit(X86_FEATURE_LAPIC_TIMER_BROKEN, c->x86_capability);
+ local_apic_timer_disabled = 1;
+#endif

if (c->x86 == 0x10 && !force_mwait)
clear_bit(X86_FEATURE_MWAIT, c->x86_capability);
--- a/include/asm-i386/apic.h
+++ b/include/asm-i386/apic.h
@@ -116,6 +116,8 @@ extern void enable_NMI_through_LVT0 (voi
extern int timer_over_8254;
extern int local_apic_timer_c2_ok;

+extern int local_apic_timer_disabled;
+
#else /* !CONFIG_X86_LOCAL_APIC */
static inline void lapic_shutdown(void) { }

--- a/include/asm-i386/cpufeature.h
+++ b/include/asm-i386/cpufeature.h
@@ -79,7 +79,7 @@
#define X86_FEATURE_ARCH_PERFMON (3*32+11) /* Intel Architectural PerfMon */
#define X86_FEATURE_PEBS (3*32+12) /* Precise-Event Based Sampling */
#define X86_FEATURE_BTS (3*32+13) /* Branch Trace Store */
-#define X86_FEATURE_LAPIC_TIMER_BROKEN (3*32+ 14) /* lapic timer broken in C1 */
+/* 14 free */
#define X86_FEATURE_SYNC_RDTSC (3*32+15) /* RDTSC synchronizes the CPU */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */

--
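The difference between the per-CPU feature bit and the global flag can be shown with a small model. This is a simplified sketch: `check_cpu()` and `timer_disabled` are hypothetical stand-ins for the `init_amd()` path and `local_apic_timer_disabled`, and the CPUID family checks done by `amd_apic_timer_broken()` are omitted. Only `ENABLE_C1E_MASK` is taken from the patch context.

```c
#include <stdint.h>

#define ENABLE_C1E_MASK 0x18000000u  /* value shown in the patch context */

/* Stand-in for the global local_apic_timer_disabled flag. */
static int timer_disabled;

/*
 * Called once per CPU with the low word of that CPU's MSR
 * (hypothetical input). Any CPU with C1E enabled sets the flag,
 * and no later CPU can clear it.
 */
static void check_cpu(uint32_t msr_lo)
{
    if (msr_lo & ENABLE_C1E_MASK)
        timer_disabled = 1;
}
```

Because the flag is only ever set, the result no longer depends on whether the boot CPU happens to be one of the cores the BIOS programmed with C1E.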

2007-10-08 20:01:21

by Roel Kluin

Subject: Re: [patch 08/12] NLM: Fix a circular lock dependency in lockd

Greg KH wrote:

@@ -477,10 +479,15 @@ nlmsvc_testlock(struct svc_rqst *rqstp,

if (block == NULL) {
struct file_lock *conf = kzalloc(sizeof(*conf), GFP_KERNEL);
+ struct nlm_host *host;

if (conf == NULL)
return nlm_granted;
- block = nlmsvc_create_block(rqstp, file, lock, cookie);
+ /* Create host handle for callback */
+ host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len);
+ if (host == NULL)
+ return nlm_lck_denied_nolocks;
+ block = nlmsvc_create_block(rqstp, host, file, lock, cookie);
if (block == NULL) {
kfree(conf);
return nlm_granted;

To be frank, I don't know what this is about, but shouldn't conf be freed if host == NULL?
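The leak in question can be illustrated with a userspace sketch that frees the allocation on every early return. Everything here is hypothetical scaffolding (`demo_alloc()`, `demo_free()`, `testlock_sketch()`, the `host_ok`/`block_ok` switches); it is not the lockd code and not the eventual incremental fix, only the shape of the problem.

```c
#include <stdlib.h>

enum status { GRANTED, DENIED_NOLOCKS };

static int live_allocs;  /* outstanding allocations, for the demo only */

static void *demo_alloc(void)
{
    void *p = malloc(16);

    if (p)
        live_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* host_ok and block_ok simulate whether the two lookups succeed. */
static enum status testlock_sketch(int host_ok, int block_ok)
{
    void *conf = demo_alloc();

    if (conf == NULL)
        return GRANTED;
    if (!host_ok) {
        demo_free(conf);  /* the free that the quoted hunk is missing */
        return DENIED_NOLOCKS;
    }
    if (!block_ok) {
        demo_free(conf);
        return GRANTED;
    }
    /* What the kernel does with conf on success is outside the quoted
     * hunk; the sketch just releases it. */
    demo_free(conf);
    return GRANTED;
}
```

Removing the `demo_free()` call in the `!host_ok` branch leaves `live_allocs` at 1 after that path, which is the leak being pointed out.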

2007-10-08 21:02:25

by Greg KH

Subject: Re: [patch 00/12] 2.6.22.10 -stable review

On Mon, Oct 08, 2007 at 11:05:51AM -0700, Greg KH wrote:
> This is the start of the stable review cycle for the 2.6.22.10 release.
> There are 12 patches in this series, all will be posted as a response to
> this one. If anyone has any issues with these being applied, please let
> us know. If anyone is a maintainer of the proper subsystem, and wants
> to add a Signed-off-by: line to the patch, please respond with it.

The "rolled-up" patch can be found at:
kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.22.10-rc1.gz

thanks,

greg k-h

2007-10-09 15:00:19

by Myklebust, Trond

Subject: Re: [patch 08/12] NLM: Fix a circular lock dependency in lockd


On Mon, 2007-10-08 at 22:01 +0200, Roel Kluin wrote:
> Greg KH wrote:
>
> @@ -477,10 +479,15 @@ nlmsvc_testlock(struct svc_rqst *rqstp,
>
> if (block == NULL) {
> struct file_lock *conf = kzalloc(sizeof(*conf), GFP_KERNEL);
> + struct nlm_host *host;
>
> if (conf == NULL)
> return nlm_granted;
> - block = nlmsvc_create_block(rqstp, file, lock, cookie);
> + /* Create host handle for callback */
> + host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len);
> + if (host == NULL)
> + return nlm_lck_denied_nolocks;
> + block = nlmsvc_create_block(rqstp, host, file, lock, cookie);
> if (block == NULL) {
> kfree(conf);
> return nlm_granted;
>
> To be frank, I don't know what this is about, but shouldn't conf be freed if host == NULL?

Thanks for spotting this!

Greg, should I resend this patch, or would you prefer an incremental
fix?

Cheers
Trond

2007-10-09 15:12:15

by Olof Johansson

Subject: Re: [patch 09/12] Fix SMP poweroff hangs

On Mon, Oct 08, 2007 at 11:06:33AM -0700, Greg KH wrote:
> From: Mark Lord <[email protected]>
>
> commit 4047727e5ae33f9b8d2b7766d1994ea6e5ec2991 from upstream
>
> We need to disable all CPUs other than the boot CPU (usually 0) before
> attempting to power-off modern SMP machines. This fixes the
> hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
> new toybox.

Paul Mackerras has reported crashes when rebooting some older ppc64
systems due to this patch.

See http://ozlabs.org/pipermail/linuxppc-dev/2007-October/043969.html.


-Olof

2007-10-09 15:14:18

by Greg KH

Subject: Re: [patch 08/12] NLM: Fix a circular lock dependency in lockd

On Tue, Oct 09, 2007 at 11:00:28AM -0400, Trond Myklebust wrote:
>
> On Mon, 2007-10-08 at 22:01 +0200, Roel Kluin wrote:
> > Greg KH wrote:
> >
> > @@ -477,10 +479,15 @@ nlmsvc_testlock(struct svc_rqst *rqstp,
> >
> > if (block == NULL) {
> > struct file_lock *conf = kzalloc(sizeof(*conf), GFP_KERNEL);
> > + struct nlm_host *host;
> >
> > if (conf == NULL)
> > return nlm_granted;
> > - block = nlmsvc_create_block(rqstp, file, lock, cookie);
> > + /* Create host handle for callback */
> > + host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len);
> > + if (host == NULL)
> > + return nlm_lck_denied_nolocks;
> > + block = nlmsvc_create_block(rqstp, host, file, lock, cookie);
> > if (block == NULL) {
> > kfree(conf);
> > return nlm_granted;
> >
> > To be frank, I don't know what this is about, but shouldn't conf be freed if host == NULL?
>
> Thanks for spotting this!
>
> Greg, should I resend this patch, or would you prefer an incremental
> fix?

An incremental one would be best.

thanks,

greg k-h

2007-10-09 15:27:20

by Myklebust, Trond

Subject: Re: [patch 08/12] NLM: Fix a circular lock dependency in lockd


On Tue, 2007-10-09 at 08:13 -0700, Greg KH wrote:
> On Tue, Oct 09, 2007 at 11:00:28AM -0400, Trond Myklebust wrote:
> >
> > On Mon, 2007-10-08 at 22:01 +0200, Roel Kluin wrote:
> > > Greg KH wrote:
> > >
> > > @@ -477,10 +479,15 @@ nlmsvc_testlock(struct svc_rqst *rqstp,
> > >
> > > if (block == NULL) {
> > > struct file_lock *conf = kzalloc(sizeof(*conf), GFP_KERNEL);
> > > + struct nlm_host *host;
> > >
> > > if (conf == NULL)
> > > return nlm_granted;
> > > - block = nlmsvc_create_block(rqstp, file, lock, cookie);
> > > + /* Create host handle for callback */
> > > + host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len);
> > > + if (host == NULL)
> > > + return nlm_lck_denied_nolocks;
> > > + block = nlmsvc_create_block(rqstp, host, file, lock, cookie);
> > > if (block == NULL) {
> > > kfree(conf);
> > > return nlm_granted;
> > >
> > > To be frank, I don't know what this is about, but shouldn't conf be freed if host == NULL?
> >
> > Thanks for spotting this!
> >
> > Greg, should I resend this patch, or would you prefer an incremental
> > fix?
>
> An incremental one would be best.
>
> thanks,
>
> greg k-h

Please see the attachment.

Cheers
Trond


Attachments:
linux-2.6.23-000-fix_memleak_in_nlmsvc_testlock.dif (1.11 kB)
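
The attached diff is not reproduced in this archive. As a rough illustration of the issue Roel raised (not the attached patch itself), the user-space sketch below mirrors the shape of the quoted hunk and releases `conf` on the `host == NULL` path as well. Everything named `_sketch` or `_stub`, the status enum and the `live_allocs` counter are hypothetical stand-ins, and `calloc`/`free` stand in for `kzalloc`/`kfree`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the NLM status codes in the quoted hunk. */
enum nlm_status_sketch { NLM_GRANTED, NLM_LCK_DENIED_NOLOCKS, NLM_BLOCKED };

/* Counts live allocations so that a leak is observable in user space. */
static int live_allocs;

static void *alloc_conf(void)
{
	void *p = calloc(1, 64);	/* stands in for kzalloc(sizeof(*conf), GFP_KERNEL) */

	if (p)
		live_allocs++;
	return p;
}

static void free_conf(void *p)
{
	if (p) {
		assert(live_allocs > 0);
		live_allocs--;
	}
	free(p);			/* stands in for kfree(conf) */
}

/* Hypothetical host lookup: returns NULL when host_ok is 0. */
static void *lookup_host_stub(int host_ok)
{
	static int dummy_host;

	return host_ok ? &dummy_host : NULL;
}

/* Same control flow as the quoted hunk, plus a free on the host == NULL path. */
static enum nlm_status_sketch testlock_sketch(int host_ok)
{
	void *conf = alloc_conf();
	void *host;

	if (conf == NULL)
		return NLM_GRANTED;

	host = lookup_host_stub(host_ok);
	if (host == NULL) {
		free_conf(conf);	/* without this, conf leaks on every failed lookup */
		return NLM_LCK_DENIED_NOLOCKS;
	}

	/* The real function keeps using conf from here on; the sketch just releases it. */
	free_conf(conf);
	return NLM_BLOCKED;
}
```

Calling `testlock_sketch(0)` takes the failed-lookup path; `live_allocs` being back at zero afterwards is the effect of the extra free.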

2007-10-09 20:45:34

by Greg KH

Subject: Re: [stable] [patch 08/12] NLM: Fix a circular lock dependency in lockd

On Tue, Oct 09, 2007 at 11:27:57AM -0400, Trond Myklebust wrote:
>
> Please see the attachment.

Thanks, I've applied this to the tree.

greg k-h

2007-10-09 22:59:59

by Greg KH

Subject: Re: [stable] [patch 09/12] Fix SMP poweroff hangs

On Tue, Oct 09, 2007 at 10:17:02AM -0500, Olof Johansson wrote:
> On Mon, Oct 08, 2007 at 11:06:33AM -0700, Greg KH wrote:
> > From: Mark Lord <[email protected]>
> >
> > commit 4047727e5ae33f9b8d2b7766d1994ea6e5ec2991 from upstream
> >
> > We need to disable all CPUs other than the boot CPU (usually 0) before
> > attempting to power-off modern SMP machines. This fixes the
> > hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
> > new toybox.
>
> Paul Mackerras has reported crashes when rebooting some older ppc64
> systems due to this patch.
>
> See http://ozlabs.org/pipermail/linuxppc-dev/2007-October/043969.html.

Has anything been done in mainline to possibly fix this?

If not, I think we'll leave it in, as it does solve a number of reported
problems.

thanks,

greg k-h

2007-10-09 23:22:45

by Thomas Gleixner

Subject: Re: [stable] [patch 09/12] Fix SMP poweroff hangs



On Tue, 9 Oct 2007, Greg KH wrote:

> On Tue, Oct 09, 2007 at 10:17:02AM -0500, Olof Johansson wrote:
> > On Mon, Oct 08, 2007 at 11:06:33AM -0700, Greg KH wrote:
> > > From: Mark Lord <[email protected]>
> > >
> > > commit 4047727e5ae33f9b8d2b7766d1994ea6e5ec2991 from upstream
> > >
> > > We need to disable all CPUs other than the boot CPU (usually 0) before
> > > attempting to power-off modern SMP machines. This fixes the
> > > hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
> > > new toybox.
> >
> > Paul Mackerras has reported crashes when rebooting some older ppc64
> > systems due to this patch.
> >
> > See http://ozlabs.org/pipermail/linuxppc-dev/2007-October/043969.html.
>
> Has anything been done in mainline to possibly fix this?
>
> If not, I think we'll leave it in, as it does solve a number of reported
> problems.

Wrapping it into a #ifdef CONFIG_X86 would be sufficient.

tglx
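
A minimal sketch of what such a guard might look like, assuming the call in question is a single step in the power-off path. The function names are hypothetical user-space stand-ins rather than the kernel's, and `CONFIG_X86` is only defined here if it is passed on the compiler command line (for example `cc -DCONFIG_X86`).

```c
#include <stdio.h>

/* Prototypes for the hypothetical stand-ins below. */
void disable_nonboot_cpus_stub(void);
void power_off_sketch(void);

/* Records whether the secondary CPUs were taken down; illustration only. */
static int nonboot_cpus_disabled;

/* Hypothetical stand-in for the step that takes the non-boot CPUs offline. */
void disable_nonboot_cpus_stub(void)
{
	nonboot_cpus_disabled = 1;
}

/* Power-off path in which the CPU shutdown step is only compiled in
 * when CONFIG_X86 is defined; other architectures skip it entirely. */
void power_off_sketch(void)
{
#ifdef CONFIG_X86
	disable_nonboot_cpus_stub();
#endif
	printf("powering off (nonboot CPUs disabled: %d)\n",
	       nonboot_cpus_disabled);
}
```

The trade-off raised later in the thread applies: a guard like this hides the call from other architectures without explaining why it fails there.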

2007-10-09 23:29:55

by Greg KH

Subject: Re: [stable] [patch 09/12] Fix SMP poweroff hangs

On Wed, Oct 10, 2007 at 01:21:18AM +0200, Thomas Gleixner wrote:
>
>
> On Tue, 9 Oct 2007, Greg KH wrote:
>
> > On Tue, Oct 09, 2007 at 10:17:02AM -0500, Olof Johansson wrote:
> > > On Mon, Oct 08, 2007 at 11:06:33AM -0700, Greg KH wrote:
> > > > From: Mark Lord <[email protected]>
> > > >
> > > > commit 4047727e5ae33f9b8d2b7766d1994ea6e5ec2991 from upstream
> > > >
> > > > We need to disable all CPUs other than the boot CPU (usually 0) before
> > > > attempting to power-off modern SMP machines. This fixes the
> > > > hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
> > > > new toybox.
> > >
> > > Paul Mackerras has reported crashes when rebooting some older ppc64
> > > systems due to this patch.
> > >
> > > See http://ozlabs.org/pipermail/linuxppc-dev/2007-October/043969.html.
> >
> > Has anything been done in mainline to possibly fix this?
> >
> > If not, I think we'll leave it in, as it does solve a number of reported
> > problems.
>
> Wrapping it into a #ifdef CONFIG_X86 would be sufficient.

Ok, get that patch into Linus's tree and then I'll add it to -stable :)

thanks,

greg k-h

2007-10-09 23:30:26

by Linus Torvalds

Subject: Re: [stable] [patch 09/12] Fix SMP poweroff hangs



On Wed, 10 Oct 2007, Thomas Gleixner wrote:
>
> Wrapping it into a #ifdef CONFIG_X86 would be sufficient.

Well, the ppc oops seems to be a ppc bug regardless.

If CPU_HOTPLUG isn't defined, the thing does nothing. And if it is
defined, I don't see why/how ppc can validly oops. So I think the first
thing to do is to try to figure out why it oopses, not to disable it for
ppc.

Linus
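
The "does nothing" behaviour described above is commonly achieved with a no-op stub selected at compile time, so that callers need no `#ifdef` of their own. A minimal user-space sketch of that pattern follows; the `_sketch` names and the `hotplug_calls` counter are hypothetical, and the exact config symbol and its placement in the real tree are not shown in this thread.

```c
/* Counts how often the "real" implementation ran; illustration only. */
static int hotplug_calls;

#ifdef CONFIG_HOTPLUG_CPU
/* Hypothetical real implementation, built only with hotplug support. */
static inline int disable_nonboot_cpus_sketch(void)
{
	hotplug_calls++;
	return 0;
}
#else
/* Without hotplug support the call is a no-op that reports success. */
static inline int disable_nonboot_cpus_sketch(void)
{
	return 0;
}
#endif

/* The caller is identical in both configurations. */
static int kernel_power_off_sketch(void)
{
	return disable_nonboot_cpus_sketch();
}
```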

2007-10-09 23:36:47

by Thomas Gleixner

Subject: Re: [stable] [patch 09/12] Fix SMP poweroff hangs

On Tue, 9 Oct 2007, Linus Torvalds wrote:
>
>
> On Wed, 10 Oct 2007, Thomas Gleixner wrote:
> >
> > Wrapping it into a #ifdef CONFIG_X86 would be sufficient.
>
> Well, the ppc oops seems to be a ppc bug regardless.
>
> If CPU_HOTPLUG isn't defined, the thing does nothing. And if it is
> defined, I don't see why/how ppc can validly oops. So I think the first
> thing to do is to try to figure out why it oopses, not to disable it for
> ppc.

Fair enough. OTOH for the affected PPC users it's a regression and that's
what I'm concerned about.

tglx


2007-10-09 23:58:21

by Olof Johansson

Subject: Re: [stable] [patch 09/12] Fix SMP poweroff hangs

On Tue, Oct 09, 2007 at 04:27:06PM -0700, Linus Torvalds wrote:
>
>
> On Wed, 10 Oct 2007, Thomas Gleixner wrote:
> >
> > Wrapping it into a #ifdef CONFIG_X86 would be sufficient.
>
> Well, the ppc oops seems to be a ppc bug regardless.
>
> If CPU_HOTPLUG isn't defined, the thing does nothing. And if it is
> defined, I don't see why/how ppc can validly oops. So I think the first
> thing to do is to try to figure out why it oopses, not to disable it for
> ppc.

The machine Paul tried on most likely has an MPIC interrupt controller, and
the oops happened when pseries_cpu_disable tried calling XICS code instead.
It's not surprising that it failed, I don't think IBM has (traditionally)
cared about cpu hotplug on those machines.

So the PPC-side fix is to not enable cpu hotplug on mpic-based
systems. I'll follow up with a patch, but I have no way to test it since
I only have one POWER5 machine, no other IBM hardware. I'd appreciate
it if someone with hardware could verify it.


-Olof

2007-10-10 00:03:20

by Olof Johansson

Subject: [PATCH] powerpc: don't enable cpu hotplug on mpic-based pseries

Don't allow cpu hotplug on systems lacking XICS interrupt controller,
since current platform code is hardcoded for it.


Signed-off-by: Olof Johansson <[email protected]>


diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 9711eb0..e29b890 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -252,6 +252,19 @@ static struct notifier_block pseries_smp_nb = {

static int __init pseries_cpu_hotplug_init(void)
{
+ struct device_node *np;
+ const char *typep;
+
+ for (np = NULL; (np = of_find_node_by_name(np,
+ "interrupt-controller"));) {
+ typep = of_get_property(np, "compatible", NULL);
+ if (strstr(typep, "open-pic")) {
+ printk(KERN_INFO "CPU Hotplug not supported on "
+ "systems using MPIC\n");
+ return 0;
+ }
+ }
+
rtas_stop_self_args.token = rtas_token("stop-self");
qcss_tok = rtas_token("query-cpu-stopped-state");

2007-10-10 00:18:52

by Stephen Rothwell

Subject: Re: [PATCH] powerpc: don't enable cpu hotplug on mpic-based pseries

Hi Olof,

On Tue, 9 Oct 2007 19:08:15 -0500 Olof Johansson <[email protected]> wrote:
>
> Don't allow cpu hotplug on systems lacking XICS interrupt controller,
> since current platform code is hardcoded for it.
>
>
> Signed-off-by: Olof Johansson <[email protected]>
>
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 9711eb0..e29b890 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -252,6 +252,19 @@ static struct notifier_block pseries_smp_nb = {
>
> static int __init pseries_cpu_hotplug_init(void)
> {
> + struct device_node *np;
> + const char *typep;
> +
> + for (np = NULL; (np = of_find_node_by_name(np,
> + "interrupt-controller"));) {
> + typep = of_get_property(np, "compatible", NULL);
> + if (strstr(typep, "open-pic")) {
> + printk(KERN_INFO "CPU Hotplug not supported on "
> + "systems using MPIC\n");

You need an of_node_put(np) here.

--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/



2007-10-10 00:33:26

by Olof Johansson

Subject: [PATCH v2] powerpc: don't enable cpu hotplug on mpic-based pseries

Don't allow cpu hotplug on systems lacking XICS interrupt controller,
since current code is hardcoded for it.


Signed-off-by: Olof Johansson <[email protected]>

---

On Wed, Oct 10, 2007 at 10:18:26AM +1000, Stephen Rothwell wrote:

> > + struct device_node *np;
> > + const char *typep;
> > +
> > + for (np = NULL; (np = of_find_node_by_name(np,
> > + "interrupt-controller"));) {
> > + typep = of_get_property(np, "compatible", NULL);
> > + if (strstr(typep, "open-pic")) {
> > + printk(KERN_INFO "CPU Hotplug not supported on "
> > + "systems using MPIC\n");
>
> You need an of_node_put(np) here.

Grmbl, you're right.

pseries_discover_pic() doesn't have one, that's where I took the above
logic from. So we're obviously already leaking device node references.

Still, no reason to make it worse.


-Olof

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 9711eb0..ae85fc0 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -252,6 +252,21 @@ static struct notifier_block pseries_smp_nb = {

static int __init pseries_cpu_hotplug_init(void)
{
+ struct device_node *np;
+ const char *typep;
+
+ for (np = NULL; (np = of_find_node_by_name(np,
+ "interrupt-controller"));) {
+ typep = of_get_property(np, "compatible", NULL);
+ if (strstr(typep, "open-pic")) {
+ of_node_put(np);
+
+ printk(KERN_INFO "CPU Hotplug not supported on "
+ "systems using MPIC\n");
+ return 0;
+ }
+ }
+
rtas_stop_self_args.token = rtas_token("stop-self");
qcss_tok = rtas_token("query-cpu-stopped-state");
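
To illustrate why the early return needs the `of_node_put()`: `of_find_node_by_name()` hands back a node with its reference count raised and drops the reference on the node passed in as the starting point. A loop that runs to completion therefore balances itself, while one that exits early still holds a reference. The user-space mock below models that contract only; the node table, the `_sketch` names and the `compatible` strings are assumptions for illustration, and the NULL check on the property is an addition that is not in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal mock of a device-tree node with a reference count. */
struct node_sketch {
	const char *name;
	const char *compatible;	/* may be NULL, like a missing property */
	int refcount;
};

static struct node_sketch nodes[] = {
	{ "interrupt-controller", "ibm,ppc-xicp", 0 },
	{ "interrupt-controller", "open-pic", 0 },
	{ "cpus", NULL, 0 },
};
#define NR_NODES (sizeof(nodes) / sizeof(nodes[0]))

static void node_put_sketch(struct node_sketch *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Drops the reference on 'from' and returns the next node called 'name'
 * with its reference count raised, or NULL when there are no more. */
static struct node_sketch *find_node_by_name_sketch(struct node_sketch *from,
						    const char *name)
{
	size_t i = from ? (size_t)(from - nodes) + 1 : 0;

	node_put_sketch(from);
	for (; i < NR_NODES; i++) {
		if (strcmp(nodes[i].name, name) == 0) {
			nodes[i].refcount++;
			return &nodes[i];
		}
	}
	return NULL;
}

/* Returns 1 if an MPIC-style controller is present, 0 otherwise. */
static int has_mpic_sketch(void)
{
	struct node_sketch *np;

	for (np = NULL;
	     (np = find_node_by_name_sketch(np, "interrupt-controller"));) {
		/* NULL check added in this sketch; the patch calls strstr() directly. */
		if (np->compatible && strstr(np->compatible, "open-pic")) {
			node_put_sketch(np);	/* early exit: drop our reference */
			return 1;
		}
	}
	return 0;
}

/* Sum of all reference counts; zero means nothing was leaked. */
static int total_refs_sketch(void)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < NR_NODES; i++)
		sum += nodes[i].refcount;
	return sum;
}
```

Removing the `node_put_sketch(np)` call before `return 1` leaves `total_refs_sketch()` at 1, which is the leak Stephen pointed out.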

2007-10-10 05:32:25

by Linus Torvalds

Subject: Re: [stable] [patch 09/12] Fix SMP poweroff hangs



On Wed, 10 Oct 2007, Thomas Gleixner wrote:
>
> Fair enough. OTOH for the affected PPC users it's a regression and that's
> what I'm concerned about.

Hmm.. I just got the appended (Kevin, better not just send me email, other
people can be interested too):

From: Kevin <[email protected]>:
Subject: Segmentation Fault
>
> I just installed Linux Kernel 2.6.23 and am getting a Segmentation Fault
> upon exit (init 0).
>
> Code: Bad EIP Value
> EIP: [<00007825>] 0x7825 SS:ESP 0068:e7305de8
> /etc/rc.d/rc.0: Line 261: 2796 Segmentation Fault /sbin/poweroff
>
> I'm running Slackware 12.0

and while there's not a lot of information there, it might still mean that
there may be some issues on x86 too at the shutdown path.

Kevin: can you get the rest of the message? The kernel should have printed
out much more than just those two lines (there should have been a stack
trace and register state, often most easily captured simply with a digital
camera).

Linus

2007-10-10 10:08:56

by Milton Miller

Subject: Re: [PATCH v2] powerpc: don't enable cpu hotplug on mpic-based pseries


Don't allow cpu hotplug on pSeries systems lacking XICS interrupt controller,
since current code is hardcoded to call xics routines.

Signed-off-by: Milton Miller <[email protected]>
---

Olof's patch searched the device-tree again, looking for an mpic. This
code instead checks that we found an xics the first time by checking the
init function.

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 9711eb0..20f010a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -262,6 +262,12 @@ static int __init pseries_cpu_hotplug_init(void)
return 0;
}

+ if (ppc_md.init_IRQ != xics_init_IRQ) {
+ printk(KERN_INFO "pSeries CPU Hotplug only supported on xics "
+ "interrupt controllers - disabling\n");
+ return 0;
+ }
+
ppc_md.cpu_die = pseries_mach_cpu_die;
smp_ops->cpu_disable = pseries_cpu_disable;
smp_ops->cpu_die = pseries_cpu_die;
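
The approach in this patch, detecting the interrupt controller by comparing a function pointer that was set during platform setup, can be sketched in user space as follows. The `machdep_sketch` structure and the `_sketch` functions are hypothetical and only loosely modelled on `ppc_md`; unlike the real init function, which returns 0 either way, the sketch returns different values so the two outcomes can be told apart.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical machine-description table, loosely modelled on ppc_md. */
struct machdep_sketch {
	void (*init_IRQ)(void);
};

/* Empty stand-ins for the two possible interrupt-controller init hooks. */
static void xics_init_IRQ_sketch(void) { }
static void mpic_init_IRQ_sketch(void) { }

static struct machdep_sketch md_sketch;

/* Returns 1 if the hotplug hooks would be installed, 0 if they are skipped. */
static int cpu_hotplug_init_sketch(void)
{
	/* Platform setup is assumed to have chosen a controller already. */
	assert(md_sketch.init_IRQ != NULL);

	if (md_sketch.init_IRQ != xics_init_IRQ_sketch) {
		printf("CPU hotplug only supported on xics - disabling\n");
		return 0;
	}
	return 1;
}
```

This avoids a second walk of the device tree, at the cost of tying the check to which init hook happened to be chosen.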

2007-10-10 16:38:33

by Olof Johansson

Subject: Re: [PATCH v2] powerpc: don't enable cpu hotplug on mpic-based pseries

On Wed, Oct 10, 2007 at 05:08:44AM -0500, Milton Miller wrote:
> Olof's patch searched the device-tree again, looking for an mpic. This
> code instead checks that we found an xics the first time by checking the
> init function.

I'm glad you find the kernel so perfect that your best use of time is
to tweak the code added in a non-critical path (performance-wise) with
a new way of checking for hardware features (no other code in the kernel
checks ppc_md function pointers for that purpose).

Pseries can only have mpic or xics, so it's not like it matters if we
check for mpic or !xics, either.

But hey, any color bike shed will do, and I don't care which one Paul
chooses to merge. I will not waste more time debating this simple patch
though. :)


-Olof

2007-10-11 00:24:18

by Paul Mackerras

Subject: Re: [stable] [patch 09/12] Fix SMP poweroff hangs

Linus Torvalds writes:

> On Wed, 10 Oct 2007, Thomas Gleixner wrote:
> >
> > Wrapping it into a #ifdef CONFIG_X86 would be sufficient.
>
> Well, the ppc oops seems to be a ppc bug regardless.

Sure. And Milton and Olof have figured out what the problem is and
proposed patches to fix it.

However, I'm a bit annoyed that a significant change in behaviour like
that, with a significant likelihood of causing regressions, went in so
late (after -rc8), in a manner that affects all SMP architectures, and
without being posted to linux-arch (unless I missed it).

Paul.

2007-10-11 05:52:19

by Paul Mackerras

Subject: Re: [PATCH v2] powerpc: don't enable cpu hotplug on mpic-based pseries

Olof Johansson writes:

> Don't allow cpu hotplug on systems lacking XICS interrupt controller,
> since current code is hardcoded for it.
...
> + for (np = NULL; (np = of_find_node_by_name(np,
> + "interrupt-controller"));) {

Looks like for_each_node_by_name would be nicer here.

If you agree, I'll hand-edit your patch to do that and apply it. Of
course, ultimately we should implement the necessary mpic bits to
support cpu hotplug.

Paul.
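
For reference, a loop helper of this kind is a thin macro over the same find call. The mock below shows the shape, assuming a find function that returns the next match after `from` (or the first match when `from` is NULL). The node table and all `_sketch` names are hypothetical, and reference counting is left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal mock of a device-tree node; no reference counting here. */
struct dt_node_sketch {
	const char *name;
};

static struct dt_node_sketch dt_nodes[] = {
	{ "interrupt-controller" },
	{ "cpus" },
	{ "interrupt-controller" },
};
#define NR_DT_NODES (sizeof(dt_nodes) / sizeof(dt_nodes[0]))

/* Returns the next node called 'name' after 'from', or NULL at the end. */
static struct dt_node_sketch *dt_find_by_name_sketch(struct dt_node_sketch *from,
						     const char *name)
{
	size_t i = from ? (size_t)(from - dt_nodes) + 1 : 0;

	assert(name != NULL);
	for (; i < NR_DT_NODES; i++)
		if (strcmp(dt_nodes[i].name, name) == 0)
			return &dt_nodes[i];
	return NULL;
}

/* Iteration helper with the same shape as for_each_node_by_name(). */
#define for_each_node_by_name_sketch(dn, name) \
	for (dn = dt_find_by_name_sketch(NULL, name); dn; \
	     dn = dt_find_by_name_sketch(dn, name))

/* Counts the interrupt-controller nodes using the helper. */
static int count_controllers_sketch(void)
{
	struct dt_node_sketch *np;
	int n = 0;

	for_each_node_by_name_sketch(np, "interrupt-controller")
		n++;
	return n;
}
```

The open-coded `for (np = NULL; (np = find(np, ...));)` form and the macro visit the same nodes; the macro only makes the intent easier to read.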

2007-10-11 05:54:31

by Olof Johansson

Subject: Re: [PATCH v2] powerpc: don't enable cpu hotplug on mpic-based pseries

On Thu, Oct 11, 2007 at 03:52:04PM +1000, Paul Mackerras wrote:
> Olof Johansson writes:
>
> > Don't allow cpu hotplug on systems lacking XICS interrupt controller,
> > since current code is hardcoded for it.
> ...
> > + for (np = NULL; (np = of_find_node_by_name(np,
> > + "interrupt-controller"));) {
>
> Looks like for_each_node_by_name would be nicer here.
>
> If you agree, I'll hand-edit your patch to do that and apply it.

Go for it, I just stole that out of the detection code. That
could be changed accordingly as well at some point (I'm guessing
for_each_node_by_name didn't exist when that was written).

> Of course, ultimately we should implement the necessary mpic bits to
> support cpu hotplug.

Yep. I'll leave that to people who have hardware access. :-)


Thanks,

-Olof