LinuxLists.cc - [patch 0/8] 2.6.23-stable review

2008-02-23 00:18:40

Subject: [patch 0/8] 2.6.23-stable review

This is the start of the stable review cycle for the 2.6.23.17 release.
There are 8 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let us know. If anyone is a maintainer of the proper subsystem, and
wants to add a Signed-off-by: line to the patch, please respond with it.

These patches are sent out with a number of different people on the
Cc: line. If you wish to be a reviewer, please email [email protected]
to add your name to the list. If you want to be off the reviewer list,
also email us.

Responses should be made by Tuesday, Feb 25, 2008, 00:10:00 UTC.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.23.17-rc1.gz
and the diffstat can be found below.

Note, this will probably be the last 2.6.23-stable release done by me,
barring any severe problems found.

thanks,

greg k-h

----------

Makefile | 2 -
arch/powerpc/platforms/powermac/feature.c | 11 +++++++-
arch/x86_64/mm/pageattr.c | 2 -
drivers/macintosh/smu.c | 25 ++++++++++++++++++-
drivers/scsi/sd.c | 34 ++++++++++++--------------
fs/nfs/write.c | 20 +++++++++++++--
include/asm-powerpc/pmac_feature.h | 8 ++++++
include/linux/ktime.h | 2 +
kernel/futex.c | 2 -
kernel/futex_compat.c | 2 -
kernel/hrtimer.c | 38 ++++++++++++++++--------------
kernel/irq/chip.c | 20 +++++++++++++++
kernel/posix-timers.c | 8 +++---
mm/memory.c | 2 +
net/netfilter/nf_conntrack_proto_tcp.c | 33 ++++++++++++++++++++------
15 files changed, 154 insertions(+), 55 deletions(-)

2008-02-23 00:19:01

by Greg KH

[permalink] [raw]

Subject: [patch 1/8] SCSI: sd: handle bad lba in sense information

2.6.23-stable review patch. If anyone has any objections, please let us know.

------------------

From: James Bottomley <[email protected]>

patch 366c246de9cec909c5eba4f784c92d1e75b4dc38 in mainline.

Some devices report medium error locations incorrectly. Add guards to
make sure the reported bad lba is actually in the request that caused
it. Additionally remove the large case statment for sector sizes and
replace it with the proper u64 divisions.

Tested-by: Mike Snitzer <[email protected]>
Cc: Stable Tree <[email protected]>
Cc: Tony Battersby <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/sd.c | 34 ++++++++++++++++------------------
1 file changed, 16 insertions(+), 18 deletions(-)

--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -901,6 +901,7 @@ static void sd_rw_intr(struct scsi_cmnd
unsigned int xfer_size = SCpnt->request_bufflen;
unsigned int good_bytes = result ? 0 : xfer_size;
u64 start_lba = SCpnt->request->sector;
+ u64 end_lba = SCpnt->request->sector + (xfer_size / 512);
u64 bad_lba;
struct scsi_sense_hdr sshdr;
int sense_valid = 0;
@@ -939,26 +940,23 @@ static void sd_rw_intr(struct scsi_cmnd
goto out;
if (xfer_size <= SCpnt->device->sector_size)
goto out;
- switch (SCpnt->device->sector_size) {
- case 256:
+ if (SCpnt->device->sector_size < 512) {
+ /* only legitimate sector_size here is 256 */
start_lba <<= 1;
- break;
- case 512:
- break;
- case 1024:
- start_lba >>= 1;
- break;
- case 2048:
- start_lba >>= 2;
- break;
- case 4096:
- start_lba >>= 3;
- break;
- default:
- /* Print something here with limiting frequency. */
- goto out;
- break;
+ end_lba <<= 1;
+ } else {
+ /* be careful ... don't want any overflows */
+ u64 factor = SCpnt->device->sector_size / 512;
+ do_div(start_lba, factor);
+ do_div(end_lba, factor);
}
+
+ if (bad_lba < start_lba || bad_lba >= end_lba)
+ /* the bad lba was reported incorrectly, we have
+ * no idea where the error is
+ */
+ goto out;
+
/* This computation should always be done in terms of
* the resolution of the device's medium.
*/

--

2008-02-23 00:19:36

by Greg KH

[permalink] [raw]

Subject: [patch 2/8] NFS: Fix a potential file corruption issue when writing

2.6.23-stable review patch. If anyone has any objections, please let us
know.

------------------
From: Trond Myklebust <[email protected]>

patch 5d47a35600270e7115061cb1320ee60ae9bcb6b8 in mainline.

If the inode is flagged as having an invalid mapping, then we can't rely on
the PageUptodate() flag. Ensure that we don't use the "anti-fragmentation"
write optimisation in nfs_updatepage(), since that will cause NFS to write
out areas of the page that are no longer guaranteed to be up to date.

A potential corruption could occur in the following scenario:

client 1 client 2
=============== ===============
fd=open("f",O_CREAT|O_WRONLY,0644);
write(fd,"fubar\n",6); // cache last page
close(fd);
fd=open("f",O_WRONLY|O_APPEND);
write(fd,"foo\n",4);
close(fd);

fd=open("f",O_WRONLY|O_APPEND);
write(fd,"bar\n",4);
close(fd);
-----
The bug may lead to the file "f" reading 'fubar\n\0\0\0\nbar\n' because
client 2 does not update the cached page after re-opening the file for
write. Instead it keeps it marked as PageUptodate() until someone calls
invaldate_inode_pages2() (typically by calling read()).

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/write.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)

--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -717,6 +717,17 @@ int nfs_flush_incompatible(struct file *
}

/*
+ * If the page cache is marked as unsafe or invalid, then we can't rely on
+ * the PageUptodate() flag. In this case, we will need to turn off
+ * write optimisations that depend on the page contents being correct.
+ */
+static int nfs_write_pageuptodate(struct page *page, struct inode *inode)
+{
+ return PageUptodate(page) &&
+ !(NFS_I(inode)->cache_validity & (NFS_INO_REVAL_PAGECACHE|NFS_INO_INVALID_DATA));
+}
+
+/*
* Update and possibly write a cached page of an NFS file.
*
* XXX: Keep an eye on generic_file_read to make sure it doesn't do bad
@@ -737,10 +748,13 @@ int nfs_updatepage(struct file *file, st
(long long)(page_offset(page) +offset));

/* If we're not using byte range locks, and we know the page
- * is entirely in cache, it may be more efficient to avoid
- * fragmenting write requests.
+ * is up to date, it may be more efficient to extend the write
+ * to cover the entire page in order to avoid fragmentation
+ * inefficiencies.
*/
- if (PageUptodate(page) && inode->i_flock == NULL && !(file->f_mode & O_SYNC)) {
+ if (nfs_write_pageuptodate(page, inode) &&
+ inode->i_flock == NULL &&
+ !(file->f_mode & O_SYNC)) {
count = max(count + offset, nfs_page_length(page));
offset = 0;
}

--

2008-02-23 00:20:01

by Greg KH

[permalink] [raw]

Subject: [patch 3/8] NETFILTER: nf_conntrack_tcp: conntrack reopening fix

2.6.23-stable review patch. If anyone has any objections, please let us
know.

------------------
From: Jozsef Kadlecsik <[email protected]>

[NETFILTER]: nf_conntrack_tcp: conntrack reopening fix

[Upstream commits b2155e7f + d0c1fd7a]

TCP connection tracking in netfilter did not handle TCP reopening
properly: active close was taken into account for one side only and
not for any side, which is fixed now. The patch includes more comments
to explain the logic how the different cases are handled.
The bug was discovered by Jeff Chua.

Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Patrick McHardy <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/nf_conntrack_proto_tcp.c | 35 +++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)

--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -135,7 +135,7 @@ enum tcp_bit_set {
* CLOSE_WAIT: ACK seen (after FIN)
* LAST_ACK: FIN seen (after FIN)
* TIME_WAIT: last ACK seen
- * CLOSE: closed connection
+ * CLOSE: closed connection (RST)
*
* LISTEN state is not used.
*
@@ -834,8 +834,21 @@ static int tcp_packet(struct nf_conn *co
case TCP_CONNTRACK_SYN_SENT:
if (old_state < TCP_CONNTRACK_TIME_WAIT)
break;
- if ((conntrack->proto.tcp.seen[!dir].flags &
- IP_CT_TCP_FLAG_CLOSE_INIT)
+ /* RFC 1122: "When a connection is closed actively,
+ * it MUST linger in TIME-WAIT state for a time 2xMSL
+ * (Maximum Segment Lifetime). However, it MAY accept
+ * a new SYN from the remote TCP to reopen the connection
+ * directly from TIME-WAIT state, if..."
+ * We ignore the conditions because we are in the
+ * TIME-WAIT state anyway.
+ *
+ * Handle aborted connections: we and the server
+ * think there is an existing connection but the client
+ * aborts it and starts a new one.
+ */
+ if (((conntrack->proto.tcp.seen[dir].flags
+ | conntrack->proto.tcp.seen[!dir].flags)
+ & IP_CT_TCP_FLAG_CLOSE_INIT)
|| (conntrack->proto.tcp.last_dir == dir
&& conntrack->proto.tcp.last_index == TCP_RST_SET)) {
/* Attempt to reopen a closed/aborted connection.
@@ -848,18 +861,25 @@ static int tcp_packet(struct nf_conn *co
}
/* Fall through */
case TCP_CONNTRACK_IGNORE:
- /* Ignored packets:
+ /* Ignored packets:
+ *
+ * Our connection entry may be out of sync, so ignore
+ * packets which may signal the real connection between
+ * the client and the server.
*
* a) SYN in ORIGINAL
* b) SYN/ACK in REPLY
* c) ACK in reply direction after initial SYN in original.
+ *
+ * If the ignored packet is invalid, the receiver will send
+ * a RST we'll catch below.
*/
if (index == TCP_SYNACK_SET
&& conntrack->proto.tcp.last_index == TCP_SYN_SET
&& conntrack->proto.tcp.last_dir != dir
&& ntohl(th->ack_seq) ==
conntrack->proto.tcp.last_end) {
- /* This SYN/ACK acknowledges a SYN that we earlier
+ /* b) This SYN/ACK acknowledges a SYN that we earlier
* ignored as invalid. This means that the client and
* the server are both in sync, while the firewall is
* not. We kill this session and block the SYN/ACK so
@@ -884,7 +904,7 @@ static int tcp_packet(struct nf_conn *co
write_unlock_bh(&tcp_lock);
if (LOG_INVALID(IPPROTO_TCP))
nf_log_packet(pf, 0, skb, NULL, NULL, NULL,
- "nf_ct_tcp: invalid packed ignored ");
+ "nf_ct_tcp: invalid packet ignored ");
return NF_ACCEPT;
case TCP_CONNTRACK_MAX:
/* Invalid packet */
@@ -938,8 +958,7 @@ static int tcp_packet(struct nf_conn *co

conntrack->proto.tcp.state = new_state;
if (old_state != new_state
- && (new_state == TCP_CONNTRACK_FIN_WAIT
- || new_state == TCP_CONNTRACK_CLOSE))
+ && new_state == TCP_CONNTRACK_FIN_WAIT)
conntrack->proto.tcp.seen[dir].flags |= IP_CT_TCP_FLAG_CLOSE_INIT;
timeout = conntrack->proto.tcp.retrans >= nf_ct_tcp_max_retrans
&& *tcp_timeouts[new_state] > nf_ct_tcp_timeout_max_retrans

--

2008-02-23 00:20:42

by Greg KH

[permalink] [raw]

Subject: [patch 4/8] hrtimer: check relative timeouts for overflow

2.6.23-stable review patch. If anyone has any objections, please let us
know.

------------------
From: Thomas Gleixner <[email protected]>

commit: 5a7780e725d1bb4c3094fcc12f1c5c5faea1e988

Various user space callers ask for relative timeouts. While we fixed
that overflow issue in hrtimer_start(), the sites which convert
relative user space values to absolute timeouts themself were uncovered.

Instead of putting overflow checks into each place add a function
which does the sanity checking and convert all affected callers to use
it.

Thanks to Frans Pop, who reported the problem and tested the fixes.

Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Tested-by: Frans Pop <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/ktime.h | 2 ++
kernel/futex.c | 2 +-
kernel/futex_compat.c | 2 +-
kernel/hrtimer.c | 38 +++++++++++++++++++++-----------------
kernel/posix-timers.c | 8 +++++---
5 files changed, 30 insertions(+), 22 deletions(-)

--- a/include/linux/ktime.h
+++ b/include/linux/ktime.h
@@ -289,6 +289,8 @@ static inline ktime_t ktime_add_us(const
return ktime_add_ns(kt, usec * 1000);
}

+extern ktime_t ktime_add_safe(const ktime_t lhs, const ktime_t rhs);
+
/*
* The resolution of the clocks. The resolution value is returned in
* the clock_getres() system call to give application programmers an
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2063,7 +2063,7 @@ asmlinkage long sys_futex(u32 __user *ua

t = timespec_to_ktime(ts);
if (cmd == FUTEX_WAIT)
- t = ktime_add(ktime_get(), t);
+ t = ktime_add_safe(ktime_get(), t);
tp = &t;
}
/*
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -175,7 +175,7 @@ asmlinkage long compat_sys_futex(u32 __u

t = timespec_to_ktime(ts);
if (cmd == FUTEX_WAIT)
- t = ktime_add(ktime_get(), t);
+ t = ktime_add_safe(ktime_get(), t);
tp = &t;
}
if (cmd == FUTEX_REQUEUE || cmd == FUTEX_CMP_REQUEUE)
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -301,6 +301,24 @@ unsigned long ktime_divns(const ktime_t
}
#endif /* BITS_PER_LONG >= 64 */

+/*
+ * Add two ktime values and do a safety check for overflow:
+ */
+
+ktime_t ktime_add_safe(const ktime_t lhs, const ktime_t rhs)
+{
+ ktime_t res = ktime_add(lhs, rhs);
+
+ /*
+ * We use KTIME_SEC_MAX here, the maximum timeout which we can
+ * return to user space in a timespec:
+ */
+ if (res.tv64 < 0 || res.tv64 < lhs.tv64 || res.tv64 < rhs.tv64)
+ res = ktime_set(KTIME_SEC_MAX, 0);
+
+ return res;
+}
+
/* High resolution timer related functions */
#ifdef CONFIG_HIGH_RES_TIMERS

@@ -658,13 +676,7 @@ hrtimer_forward(struct hrtimer *timer, k
*/
orun++;
}
- timer->expires = ktime_add(timer->expires, interval);
- /*
- * Make sure, that the result did not wrap with a very large
- * interval.
- */
- if (timer->expires.tv64 < 0)
- timer->expires = ktime_set(KTIME_SEC_MAX, 0);
+ timer->expires = ktime_add_safe(timer->expires, interval);

return orun;
}
@@ -815,7 +827,7 @@ hrtimer_start(struct hrtimer *timer, kti
new_base = switch_hrtimer_base(timer, base);

if (mode == HRTIMER_MODE_REL) {
- tim = ktime_add(tim, new_base->get_time());
+ tim = ktime_add_safe(tim, new_base->get_time());
/*
* CONFIG_TIME_LOW_RES is a temporary way for architectures
* to signal that they simply return xtime in
@@ -824,16 +836,8 @@ hrtimer_start(struct hrtimer *timer, kti
* timeouts. This will go away with the GTOD framework.
*/
#ifdef CONFIG_TIME_LOW_RES
- tim = ktime_add(tim, base->resolution);
+ tim = ktime_add_safe(tim, base->resolution);
#endif
- /*
- * Careful here: User space might have asked for a
- * very long sleep, so the add above might result in a
- * negative number, which enqueues the timer in front
- * of the queue.
- */
- if (tim.tv64 < 0)
- tim.tv64 = KTIME_MAX;
}
timer->expires = tim;

--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -765,9 +765,11 @@ common_timer_set(struct k_itimer *timr,
/* SIGEV_NONE timers are not queued ! See common_timer_get */
if (((timr->it_sigev_notify & ~SIGEV_THREAD_ID) == SIGEV_NONE)) {
/* Setup correct expiry time for relative timers */
- if (mode == HRTIMER_MODE_REL)
- timer->expires = ktime_add(timer->expires,
- timer->base->get_time());
+ if (mode == HRTIMER_MODE_REL) {
+ timer->expires =
+ ktime_add_safe(timer->expires,
+ timer->base->get_time());
+ }
return 0;
}

--

2008-02-23 00:21:05

by Greg KH

[permalink] [raw]

Subject: [patch 5/8] genirq: do not leave interupts enabled on free_irq

2.6.23-stable review patch. If anyone has any objections, please let us
know.

------------------
From: Thomas Gleixner <[email protected]>

commit 89d694b9dbe769ca1004e01db0ca43964806a611

The default_disable() function was changed in commit:

76d2160147f43f982dfe881404cfde9fd0a9da21
genirq: do not mask interrupts by default

It removed the mask function in favour of the default delayed
interrupt disabling. Unfortunately this also broke the shutdown in
free_irq() when the last handler is removed from the interrupt for
those architectures which rely on the default implementations. Now we
can end up with a enabled interrupt line after the last handler was
removed, which can result in spurious interrupts.

Fix this by adding a default_shutdown function, which is only
installed, when the irqchip implementation does provide neither a
shutdown nor a disable function.

Pointed-out-by: Michael Hennerich <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Tested-by: Michael Hennerich <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/irq/chip.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -246,6 +246,17 @@ static unsigned int default_startup(unsi
}

/*
+ * default shutdown function
+ */
+static void default_shutdown(unsigned int irq)
+{
+ struct irq_desc *desc = irq_desc + irq;
+
+ desc->chip->mask(irq);
+ desc->status |= IRQ_MASKED;
+}
+
+/*
* Fixup enable/disable function pointers
*/
void irq_chip_set_defaults(struct irq_chip *chip)
@@ -256,8 +267,15 @@ void irq_chip_set_defaults(struct irq_ch
chip->disable = default_disable;
if (!chip->startup)
chip->startup = default_startup;
+ /*
+ * We use chip->disable, when the user provided its own. When
+ * we have default_disable set for chip->disable, then we need
+ * to use default_shutdown, otherwise the irq line is not
+ * disabled on free_irq():
+ */
if (!chip->shutdown)
- chip->shutdown = chip->disable;
+ chip->shutdown = chip->disable != default_disable ?
+ chip->disable : default_shutdown;
if (!chip->name)
chip->name = chip->typename;
if (!chip->end)

--

2008-02-23 00:21:36

by Greg KH

[permalink] [raw]

Subject: [patch 6/8] Disable G5 NAP mode during SMU commands on U3

2.6.23-stable review patch. If anyone has any objections, please let us
know.

------------------
From: Benjamin Herrenschmidt <[email protected]>

patch 592a607bbc053bc6f614a0e619326009f4b3829e in mainline.

It appears that with the U3 northbridge, if the processor is in NAP
mode the whole time while waiting for an SMU command to complete,
then the SMU will fail. It could be related to the weird backward
mechanism the SMU uses to get to system memory via i2c to the
northbridge that doesn't operate properly when the said bridge is
in napping along with the CPU. That is on U3 at least, U4 doesn't
seem to be affected.

This didn't show before NO_HZ as the timer wakeup was enough to make
it work it seems, but that is no longer the case.

This fixes it by disabling NAP mode on those machines while
an SMU command is in flight.

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/platforms/powermac/feature.c | 11 ++++++++++-
drivers/macintosh/smu.c | 25 ++++++++++++++++++++++++-
include/asm-powerpc/pmac_feature.h | 8 ++++++++
3 files changed, 42 insertions(+), 2 deletions(-)

--- a/arch/powerpc/platforms/powermac/feature.c
+++ b/arch/powerpc/platforms/powermac/feature.c
@@ -2565,6 +2565,8 @@ static void __init probe_uninorth(void)

/* Locate core99 Uni-N */
uninorth_node = of_find_node_by_name(NULL, "uni-n");
+ uninorth_maj = 1;
+
/* Locate G5 u3 */
if (uninorth_node == NULL) {
uninorth_node = of_find_node_by_name(NULL, "u3");
@@ -2575,8 +2577,10 @@ static void __init probe_uninorth(void)
uninorth_node = of_find_node_by_name(NULL, "u4");
uninorth_maj = 4;
}
- if (uninorth_node == NULL)
+ if (uninorth_node == NULL) {
+ uninorth_maj = 0;
return;
+ }

addrp = of_get_property(uninorth_node, "reg", NULL);
if (addrp == NULL)
@@ -3029,3 +3033,8 @@ void pmac_resume_agp_for_card(struct pci
pmac_agp_resume(pmac_agp_bridge);
}
EXPORT_SYMBOL(pmac_resume_agp_for_card);
+
+int pmac_get_uninorth_variant(void)
+{
+ return uninorth_maj;
+}
--- a/drivers/macintosh/smu.c
+++ b/drivers/macintosh/smu.c
@@ -85,6 +85,7 @@ struct smu_device {
u32 cmd_buf_abs; /* command buffer absolute */
struct list_head cmd_list;
struct smu_cmd *cmd_cur; /* pending command */
+ int broken_nap;
struct list_head cmd_i2c_list;
struct smu_i2c_cmd *cmd_i2c_cur; /* pending i2c command */
struct timer_list i2c_timer;
@@ -135,6 +136,19 @@ static void smu_start_cmd(void)
fend = faddr + smu->cmd_buf->length + 2;
flush_inval_dcache_range(faddr, fend);

+
+ /* We also disable NAP mode for the duration of the command
+ * on U3 based machines.
+ * This is slightly racy as it can be written back to 1 by a sysctl
+ * but that never happens in practice. There seem to be an issue with
+ * U3 based machines such as the iMac G5 where napping for the
+ * whole duration of the command prevents the SMU from fetching it
+ * from memory. This might be related to the strange i2c based
+ * mechanism the SMU uses to access memory.
+ */
+ if (smu->broken_nap)
+ powersave_nap = 0;
+
/* This isn't exactly a DMA mapping here, I suspect
* the SMU is actually communicating with us via i2c to the
* northbridge or the CPU to access RAM.
@@ -211,6 +225,10 @@ static irqreturn_t smu_db_intr(int irq,
misc = cmd->misc;
mb();
cmd->status = rc;
+
+ /* Re-enable NAP mode */
+ if (smu->broken_nap)
+ powersave_nap = 1;
bail:
/* Start next command if any */
smu_start_cmd();
@@ -461,7 +479,7 @@ int __init smu_init (void)
if (np == NULL)
return -ENODEV;

- printk(KERN_INFO "SMU driver %s %s\n", VERSION, AUTHOR);
+ printk(KERN_INFO "SMU: Driver %s %s\n", VERSION, AUTHOR);

if (smu_cmdbuf_abs == 0) {
printk(KERN_ERR "SMU: Command buffer not allocated !\n");
@@ -533,6 +551,11 @@ int __init smu_init (void)
goto fail;
}

+ /* U3 has an issue with NAP mode when issuing SMU commands */
+ smu->broken_nap = pmac_get_uninorth_variant() < 4;
+ if (smu->broken_nap)
+ printk(KERN_INFO "SMU: using NAP mode workaround\n");
+
sys_ctrler = SYS_CTRLER_SMU;
return 0;

--- a/include/asm-powerpc/pmac_feature.h
+++ b/include/asm-powerpc/pmac_feature.h
@@ -392,6 +392,14 @@ extern u32 __iomem *uninorth_base;
#define UN_BIS(r,v) (UN_OUT((r), UN_IN(r) | (v)))
#define UN_BIC(r,v) (UN_OUT((r), UN_IN(r) & ~(v)))

+/* Uninorth variant:
+ *
+ * 0 = not uninorth
+ * 1 = U1.x or U2.x
+ * 3 = U3
+ * 4 = U4
+ */
+extern int pmac_get_uninorth_variant(void);

#endif /* __ASM_POWERPC_PMAC_FEATURE_H */
#endif /* __KERNEL__ */

--

2008-02-23 00:22:07

by Greg KH

[permalink] [raw]

Subject: [patch 7/8] Be more robust about bad arguments in get_user_pages()

2.6.23-stable review patch. If anyone has any objections, please let us
know.

------------------
From: Jonathan Corbet <[email protected]>

patch 900cf086fd2fbad07f72f4575449e0d0958f860f in mainline.

So I spent a while pounding my head against my monitor trying to figure
out the vmsplice() vulnerability - how could a failure to check for
*read* access turn into a root exploit? It turns out that it's a buffer
overflow problem which is made easy by the way get_user_pages() is
coded.

In particular, "len" is a signed int, and it is only checked at the
*end* of a do {} while() loop. So, if it is passed in as zero, the loop
will execute once and decrement len to -1. At that point, the loop will
proceed until the next invalid address is found; in the process, it will
likely overflow the pages array passed in to get_user_pages().

I think that, if get_user_pages() has been asked to grab zero pages,
that's what it should do. Thus this patch; it is, among other things,
enough to block the (already fixed) root exploit and any others which
might be lurking in similar code. I also think that the number of pages
should be unsigned, but changing the prototype of this function probably
requires some more careful review.

Signed-off-by: Jonathan Corbet <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/memory.c | 2 ++
1 file changed, 2 insertions(+)

--- a/mm/memory.c
+++ b/mm/memory.c
@@ -981,6 +981,8 @@ int get_user_pages(struct task_struct *t
int i;
unsigned int vm_flags;

+ if (len <= 0)
+ return 0;
/*
* Require read or write permissions.
* If 'force' is set, we only require the "MAY" flags.

--

2008-02-23 00:22:45

by Greg KH

[permalink] [raw]

Subject: [patch 8/8] x86_64: CPA, fix cache attribute inconsistency bug

2.6.23-stable review patch. If anyone has any objections, please let us
know.

------------------
From: Ingo Molnar <[email protected]>

no upstream git id as the code has been rewritten.

fix CPA cache attribute bug in v2.6.23. When phys_base is nonzero
(when CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates
the secondary alias address by -14 MB (depending on the configured
offset).

The default 64-bit kernels of Fedora and Ubuntu are affected:

$ grep RELOCA /boot/config-2.6.23.9-85.fc8
CONFIG_RELOCATABLE=y

$ grep RELOC /boot/config-2.6.22-14-generic
CONFIG_RELOCATABLE=y

and probably on many other distros as well.

the bug affects all pages in the first 40 MB of physical RAM that
are allocated by some subsystem that does ioremap_nocache() on them:

if (__pa(address) < KERNEL_TEXT_SIZE) {

Hence we might leave page table entries with inconsistent cache
attributes around (pages mapped at both UnCacheable and Write-Back),
and we can also set the wrong kernel text pages to UnCacheable.

the effects of this bug can be random slowdowns and other misbehavior.
If for example AGP allocates its aperture pages into the first 40 MB
of physical RAM, then the -14 MB bug might mark random kernel texto
pages as uncacheable, slowing down a random portion of the 64-bit
kernel until the AGP driver is unloaded.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86_64/mm/pageattr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86_64/mm/pageattr.c
+++ b/arch/x86_64/mm/pageattr.c
@@ -207,7 +207,7 @@ int change_page_attr_addr(unsigned long
if (__pa(address) < KERNEL_TEXT_SIZE) {
unsigned long addr2;
pgprot_t prot2;
- addr2 = __START_KERNEL_map + __pa(address);
+ addr2 = __START_KERNEL_map + __pa(address) - phys_base;
/* Make sure the kernel mappings stay executable */
prot2 = pte_pgprot(pte_mkexec(pfn_pte(0, prot)));
err = __change_page_attr(addr2, pfn, prot2,

--

2008-02-23 16:57:39

by Chuck Ebbert

[permalink] [raw]

Subject: Re: [patch 0/8] 2.6.23-stable review

On 02/22/2008 07:17 PM, Greg KH wrote:
> This is the start of the stable review cycle for the 2.6.23.17 release.
> There are 8 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let us know. If anyone is a maintainer of the proper subsystem, and
> wants to add a Signed-off-by: line to the patch, please respond with it.
>
> These patches are sent out with a number of different people on the
> Cc: line. If you wish to be a reviewer, please email [email protected]
> to add your name to the list. If you want to be off the reviewer list,
> also email us.
>
> Responses should be made by Tuesday, Feb 25, 2008, 00:10:00 UTC.
> Anything received after that time might be too late.
>

Still missing this one? (trivial backport)

Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9d55b9923a1b7ea8193b8875c57ec940dc2ff027
Commit: 9d55b9923a1b7ea8193b8875c57ec940dc2ff027
Parent: 5df7fa1c62146a0933767d040d400013310dbcc7
Author: Thomas Gleixner <[email protected]>
AuthorDate: Fri Feb 1 17:45:14 2008 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri Feb 1 17:45:14 2008 +0100

x86: replace LOCK_PREFIX in futex.h

The exception fixup for the futex macros __futex_atomic_op1/2 and
futex_atomic_cmpxchg_inatomic() is missing an entry when the lock
prefix is replaced by a NOP via SMP alternatives.

Chuck Ebert tracked this down from the information provided in:
https://bugzilla.redhat.com/show_bug.cgi?id=429412

A possible solution would be to add another fixup after the
LOCK_PREFIX, so both the LOCK and NOP case have their own entry in the
exception table, but it's not really worth the trouble.

Simply replace LOCK_PREFIX with lock and keep those untouched by SMP
alternatives.

Signed-off-by: Thomas Gleixner <[email protected]>

Signed-off-by: Ingo Molnar <[email protected]>

[[email protected]: backport to 2.6.23]

---
include/asm-i386/futex.h | 6 +++---
include/asm-x86_64/futex.h | 6 +++---
2 files changed, 6 insertions(+), 6 deletions(-)

--- vanilla.orig/include/asm-i386/futex.h
+++ vanilla/include/asm-i386/futex.h
@@ -28,7 +28,7 @@
"1: movl %2, %0\n\
movl %0, %3\n" \
insn "\n" \
-"2: " LOCK_PREFIX "cmpxchgl %3, %2\n\
+"2: lock ; cmpxchgl %3, %2\n\
jnz 1b\n\
3: .section .fixup,\"ax\"\n\
4: mov %5, %1\n\
@@ -68,7 +68,7 @@ futex_atomic_op_inuser (int encoded_op,
#endif
switch (op) {
case FUTEX_OP_ADD:
- __futex_atomic_op1(LOCK_PREFIX "xaddl %0, %2", ret,
+ __futex_atomic_op1("lock ; xaddl %0, %2", ret,
oldval, uaddr, oparg);
break;
case FUTEX_OP_OR:
@@ -111,7 +111,7 @@ futex_atomic_cmpxchg_inatomic(int __user
return -EFAULT;

__asm__ __volatile__(
- "1: " LOCK_PREFIX "cmpxchgl %3, %1 \n"
+ "1: lock ; cmpxchgl %3, %1 \n"

"2: .section .fixup, \"ax\" \n"
"3: mov %2, %0 \n"
--- vanilla.orig/include/asm-x86_64/futex.h
+++ vanilla/include/asm-x86_64/futex.h
@@ -27,7 +27,7 @@
"1: movl %2, %0\n\
movl %0, %3\n" \
insn "\n" \
-"2: " LOCK_PREFIX "cmpxchgl %3, %2\n\
+"2: lock ; cmpxchgl %3, %2\n\
jnz 1b\n\
3: .section .fixup,\"ax\"\n\
4: mov %5, %1\n\
@@ -62,7 +62,7 @@ futex_atomic_op_inuser (int encoded_op,
__futex_atomic_op1("xchgl %0, %2", ret, oldval, uaddr, oparg);
break;
case FUTEX_OP_ADD:
- __futex_atomic_op1(LOCK_PREFIX "xaddl %0, %2", ret, oldval,
+ __futex_atomic_op1("lock ; xaddl %0, %2", ret, oldval,
uaddr, oparg);
break;
case FUTEX_OP_OR:
@@ -101,7 +101,7 @@ futex_atomic_cmpxchg_inatomic(int __user
return -EFAULT;

__asm__ __volatile__(
- "1: " LOCK_PREFIX "cmpxchgl %3, %1 \n"
+ "1: lock ; cmpxchgl %3, %1 \n"

"2: .section .fixup, \"ax\" \n"
"3: mov %2, %0 \n"