2010-12-08 00:34:28

by Greg KH

Subject: [00/44] 2.6.27.57-stable review


This is the start of the stable review cycle for the 2.6.27.57 release.
There are 44 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let us know. If anyone is a maintainer of the proper subsystem, and
wants to add a Signed-off-by: line to the patch, please respond with it.

Responses should be made by Thursday, December 9, 2010, 20:00:00 UTC.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.27.57-rc1.gz
and the diffstat can be found below.

Note, this is going to be my last .27 kernel release before handing it
off to someone else for "longterm" support.

thanks,

greg k-h


Makefile | 2 +-
arch/arm/lib/findbit.S | 6 ++-
arch/um/os-Linux/time.c | 2 +-
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c | 1 +
block/blk-map.c | 2 +
drivers/ata/libata-scsi.c | 5 ++-
drivers/char/vt_ioctl.c | 11 +++---
drivers/media/video/ivtv/ivtvfb.c | 2 +
drivers/usb/atm/ueagle-atm.c | 6 ++--
drivers/usb/core/devio.c | 7 ++--
drivers/usb/host/ehci-hcd.c | 10 +++---
drivers/usb/misc/cypress_cy7c63.c | 6 +--
drivers/usb/misc/iowarrior.c | 1 +
drivers/usb/misc/sisusbvga/sisusb.c | 1 +
drivers/usb/misc/trancevibrator.c | 2 +-
drivers/usb/misc/usbled.c | 2 +-
drivers/usb/storage/sierra_ms.c | 2 +-
fs/bio.c | 14 ++++++++-
fs/ecryptfs/inode.c | 4 ++
include/net/x25.h | 4 ++
ipc/compat.c | 6 +++
ipc/compat_mq.c | 5 +++
ipc/sem.c | 2 +
ipc/shm.c | 1 +
kernel/exit.c | 9 +++++
lib/percpu_counter.c | 1 +
mm/internal.h | 2 +-
mm/memory_hotplug.c | 2 +-
mm/mempolicy.c | 2 +-
net/can/bcm.c | 2 +-
net/core/ethtool.c | 2 +-
net/core/stream.c | 8 ++--
net/decnet/af_decnet.c | 2 +
net/econet/af_econet.c | 29 ++++++----------
net/ipv4/tcp.c | 11 +++++--
net/ipv4/tcp_input.c | 2 +
net/ipv4/udp.c | 4 ++-
net/ipv4/xfrm4_policy.c | 4 +-
net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
net/ipv6/route.c | 28 ++++++++++++++--
net/irda/iriap.c | 3 +-
net/irda/parameters.c | 4 ++-
net/rose/af_rose.c | 4 +-
net/sctp/protocol.c | 4 ++-
net/socket.c | 4 ++
net/x25/af_x25.c | 47 +++++++++++++++++++++++++++-
net/x25/x25_facilities.c | 32 ++++++++++++++-----
net/x25/x25_in.c | 17 ++++++++--
48 files changed, 244 insertions(+), 84 deletions(-)


2010-12-08 00:34:18

by Greg KH

Subject: [03/44] irda: Fix heap memory corruption in iriap.c

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Samuel Ortiz <[email protected]>

commit 37f9fc452d138dfc4da2ee1ce5ae85094efc3606 upstream.

While parsing the GetValuebyClass command frame, we could potentially write
past the skb->data pointer.

Reported-by: Ilja Van Sprundel <[email protected]>
Signed-off-by: Samuel Ortiz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/irda/iriap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/irda/iriap.c
+++ b/net/irda/iriap.c
@@ -501,7 +501,8 @@ static void iriap_getvaluebyclass_confir
IRDA_DEBUG(4, "%s(), strlen=%d\n", __func__, value_len);

/* Make sure the string is null-terminated */
- fp[n+value_len] = 0x00;
+ if (n + value_len < skb->len)
+ fp[n + value_len] = 0x00;
IRDA_DEBUG(4, "Got string %s\n", fp+n);

/* Will truncate to IAS_MAX_STRING bytes */
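
The guard added above can be sketched in plain userspace C. This is only an illustration, not the kernel code: terminate_if_in_bounds() and its parameters are hypothetical names, and a byte array stands in for the skb data.

```c
#include <stddef.h>

/*
 * Hypothetical helper: write a NUL terminator at buf[off + len] only when
 * that index lies inside the buffer. Returns 1 if the byte was written and
 * 0 if the write was skipped because it would land out of bounds.
 */
static int terminate_if_in_bounds(unsigned char *buf, size_t buf_len,
                                  size_t off, size_t len)
{
    /* Written as two comparisons so that off + len cannot wrap around. */
    if (off >= buf_len || len >= buf_len - off)
        return 0;
    buf[off + len] = 0x00;
    return 1;
}
```

A caller that receives 0 knows the string is not terminated and has to rely on an explicit length instead.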

2010-12-08 00:34:16

by Greg KH

Subject: [01/44] block: check for proper length of iov entries in blk_rq_map_user_iov()

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Jens Axboe <[email protected]>

commit 9284bcf4e335e5f18a8bc7b26461c33ab60d0689 upstream.

Ensure that we pass down properly validated iov segments before
calling into the mapping or copy functions.

Reported-by: Dan Rosenberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
block/blk-map.c | 2 ++
1 file changed, 2 insertions(+)

--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -191,6 +191,8 @@ int blk_rq_map_user_iov(struct request_q
unaligned = 1;
break;
}
+ if (!iov[i].iov_len)
+ return -EINVAL;
}

if (unaligned || (q->dma_pad_mask & len))
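
As a rough userspace illustration of the check this patch adds, the sketch below rejects an iovec array that contains a zero-length segment. validate_iov() is a hypothetical name, and the POSIX struct iovec from <sys/uio.h> stands in for the kernel's sg_iovec.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical validator: return 0 when every segment has a non-zero
 * length, or -EINVAL for an empty array or any zero-length segment.
 */
static int validate_iov(const struct iovec *iov, int count)
{
    int i;

    if (iov == NULL || count <= 0)
        return -EINVAL;

    for (i = 0; i < count; i++) {
        if (iov[i].iov_len == 0)
            return -EINVAL;
    }
    return 0;
}
```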

2010-12-08 00:34:31

by Greg KH

Subject: [16/44] usb: misc: sisusbvga: fix information leak to userland

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Vasiliy Kulikov <[email protected]>

commit 5dc92cf1d0b4b0debbd2e333b83f9746c103533d upstream.

Structure sisusb_info is copied to userland with "sisusb_reserved" field
uninitialized. It leads to leaking of contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/misc/sisusbvga/sisusb.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/usb/misc/sisusbvga/sisusb.c
+++ b/drivers/usb/misc/sisusbvga/sisusb.c
@@ -3031,6 +3031,7 @@ sisusb_ioctl(struct file *file, unsigned
#else
x.sisusb_conactive = 0;
#endif
+ memset(x.sisusb_reserved, 0, sizeof(x.sisusb_reserved));

if (copy_to_user((void __user *)arg, &x, sizeof(x)))
retval = -EFAULT;
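
The general pattern behind this fix is to clear a structure before filling it, so that fields the code never assigns cannot carry stale stack contents out to the caller. A minimal sketch, assuming a made-up struct demo_info with a reserved array (none of these names come from the driver):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical ioctl result structure with a reserved tail. */
struct demo_info {
    uint32_t id;
    uint32_t version;
    uint8_t  reserved[16];
};

/*
 * Zero the whole structure first, then set the fields we know about.
 * Clearing everything also covers any padding the compiler inserted.
 */
static void fill_demo_info(struct demo_info *out, uint32_t id,
                           uint32_t version)
{
    memset(out, 0, sizeof(*out));
    out->id = id;
    out->version = version;
}
```

The patch itself clears only the reserved member; clearing the whole structure, as here, is the approach the ivtvfb patch later in this series takes.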

2010-12-08 00:34:45

by Greg KH

Subject: [31/44] net: Fix IPv6 PMTU disc. w/ asymmetric routes

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Maciej Żenczykowski <[email protected]>

[ Upstream commit ae878ae280bea286ff2b1e1cb6e609dd8cb4501d ]

Signed-off-by: Maciej Żenczykowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/route.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)

--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1507,14 +1507,13 @@ out:
* i.e. Path MTU discovery
*/

-void rt6_pmtu_discovery(struct in6_addr *daddr, struct in6_addr *saddr,
- struct net_device *dev, u32 pmtu)
+static void rt6_do_pmtu_disc(struct in6_addr *daddr, struct in6_addr *saddr,
+ struct net *net, u32 pmtu, int ifindex)
{
struct rt6_info *rt, *nrt;
- struct net *net = dev_net(dev);
int allfrag = 0;

- rt = rt6_lookup(net, daddr, saddr, dev->ifindex, 0);
+ rt = rt6_lookup(net, daddr, saddr, ifindex, 0);
if (rt == NULL)
return;

@@ -1582,6 +1581,27 @@ out:
dst_release(&rt->u.dst);
}

+void rt6_pmtu_discovery(struct in6_addr *daddr, struct in6_addr *saddr,
+ struct net_device *dev, u32 pmtu)
+{
+ struct net *net = dev_net(dev);
+
+ /*
+ * RFC 1981 states that a node "MUST reduce the size of the packets it
+ * is sending along the path" that caused the Packet Too Big message.
+ * Since it's not possible in the general case to determine which
+ * interface was used to send the original packet, we update the MTU
+ * on the interface that will be used to send future packets. We also
+ * update the MTU on the interface that received the Packet Too Big in
+ * case the original packet was forced out that interface with
+ * SO_BINDTODEVICE or similar. This is the next best thing to the
+ * correct behaviour, which would be to update the MTU on all
+ * interfaces.
+ */
+ rt6_do_pmtu_disc(daddr, saddr, net, pmtu, 0);
+ rt6_do_pmtu_disc(daddr, saddr, net, pmtu, dev->ifindex);
+}
+
/*
* Misc support functions
*/

2010-12-08 00:34:57

by Greg KH

Subject: [39/44] memory corruption in X.25 facilities parsing

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: andrew hendry <[email protected]>

commit a6331d6f9a4298173b413cf99a40cc86a9d92c37 upstream.

Signed-of-by: Andrew Hendry <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/x25/x25_facilities.c | 8 ++++----
net/x25/x25_in.c | 2 ++
2 files changed, 6 insertions(+), 4 deletions(-)

--- a/net/x25/x25_facilities.c
+++ b/net/x25/x25_facilities.c
@@ -134,15 +134,15 @@ int x25_parse_facilities(struct sk_buff
case X25_FAC_CLASS_D:
switch (*p) {
case X25_FAC_CALLING_AE:
- if (p[1] > X25_MAX_DTE_FACIL_LEN)
- break;
+ if (p[1] > X25_MAX_DTE_FACIL_LEN || p[1] <= 1)
+ return 0;
dte_facs->calling_len = p[2];
memcpy(dte_facs->calling_ae, &p[3], p[1] - 1);
*vc_fac_mask |= X25_MASK_CALLING_AE;
break;
case X25_FAC_CALLED_AE:
- if (p[1] > X25_MAX_DTE_FACIL_LEN)
- break;
+ if (p[1] > X25_MAX_DTE_FACIL_LEN || p[1] <= 1)
+ return 0;
dte_facs->called_len = p[2];
memcpy(dte_facs->called_ae, &p[3], p[1] - 1);
*vc_fac_mask |= X25_MASK_CALLED_AE;
--- a/net/x25/x25_in.c
+++ b/net/x25/x25_in.c
@@ -118,6 +118,8 @@ static int x25_state1_machine(struct soc
&x25->vc_facil_mask);
if (len > 0)
skb_pull(skb, len);
+ else
+ return -1;
/*
* Copy any Call User Data.
*/

2010-12-08 00:34:37

by Greg KH

Subject: [23/44] USB: misc: usbled: fix up some sysfs attribute permissions

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Greg Kroah-Hartman <[email protected]>

commit 48f115470e68d443436b76b22dad63ffbffd6b97 upstream.

They should not be writable by any user.

Reported-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/misc/usbled.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/usb/misc/usbled.c
+++ b/drivers/usb/misc/usbled.c
@@ -94,7 +94,7 @@ static ssize_t set_##value(struct device
change_color(led); \
return count; \
} \
-static DEVICE_ATTR(value, S_IWUGO | S_IRUGO, show_##value, set_##value);
+static DEVICE_ATTR(value, S_IRUGO | S_IWUSR, show_##value, set_##value);
show_set(blue);
show_set(red);
show_set(green);
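
To see what the permission change means numerically, the sketch below rebuilds the two mode values from the standard <sys/stat.h> bits. The DEMO_* macros are stand-ins for the kernel-only S_IRUGO and S_IWUGO helpers, assumed here to be the OR of the user, group and other bits.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Userspace stand-ins for the kernel's S_IRUGO / S_IWUGO (assumption). */
#define DEMO_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)
#define DEMO_S_IWUGO (S_IWUSR | S_IWGRP | S_IWOTH)

/* Mode before the patch: readable and writable by everyone. */
#define DEMO_OLD_MODE (DEMO_S_IWUGO | DEMO_S_IRUGO)
/* Mode after the patch: readable by everyone, writable by owner only. */
#define DEMO_NEW_MODE (DEMO_S_IRUGO | S_IWUSR)

/* Returns non-zero when "other" users may write to a file of this mode. */
static int world_writable(mode_t mode)
{
    return (mode & S_IWOTH) != 0;
}
```

On Linux the old value works out to 0666 and the new one to 0644.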

2010-12-08 00:35:46

by Greg KH

Subject: [43/44] econet: disallow NULL remote addr for sendmsg(), fixes CVE-2010-3849

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Phil Blundell <[email protected]>

commit fa0e846494792e722d817b9d3d625a4ef4896c96 upstream.

Later parts of econet_sendmsg() rely on saddr != NULL, so return early
with EINVAL if NULL was passed; otherwise an oops may occur.

Signed-off-by: Phil Blundell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/econet/af_econet.c | 26 ++++++++------------------
1 file changed, 8 insertions(+), 18 deletions(-)

--- a/net/econet/af_econet.c
+++ b/net/econet/af_econet.c
@@ -296,23 +296,14 @@ static int econet_sendmsg(struct kiocb *

mutex_lock(&econet_mutex);

- if (saddr == NULL) {
- struct econet_sock *eo = ec_sk(sk);
-
- addr.station = eo->station;
- addr.net = eo->net;
- port = eo->port;
- cb = eo->cb;
- } else {
- if (msg->msg_namelen < sizeof(struct sockaddr_ec)) {
- mutex_unlock(&econet_mutex);
- return -EINVAL;
- }
- addr.station = saddr->addr.station;
- addr.net = saddr->addr.net;
- port = saddr->port;
- cb = saddr->cb;
- }
+ if (saddr == NULL || msg->msg_namelen < sizeof(struct sockaddr_ec)) {
+ mutex_unlock(&econet_mutex);
+ return -EINVAL;
+ }
+ addr.station = saddr->addr.station;
+ addr.net = saddr->addr.net;
+ port = saddr->port;
+ cb = saddr->cb;

/* Look for a device with the right network number. */
dev = net2dev_map[addr.net];
@@ -350,7 +341,6 @@ static int econet_sendmsg(struct kiocb *

eb = (struct ec_cb *)&skb->cb;

- /* BUG: saddr may be NULL */
eb->cookie = saddr->cookie;
eb->sec = *saddr;
eb->sent = ec_tx_done;

2010-12-08 00:35:00

by Greg KH

Subject: [36/44] net: Truncate recvfrom and sendto length to INT_MAX.

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Linus Torvalds <[email protected]>

commit 253eacc070b114c2ec1f81b067d2fed7305467b0 upstream.

Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/socket.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/net/socket.c
+++ b/net/socket.c
@@ -1691,6 +1691,8 @@ SYSCALL_DEFINE6(sendto, int, fd, void __
struct iovec iov;
int fput_needed;

+ if (len > INT_MAX)
+ len = INT_MAX;
sock = sockfd_lookup_light(fd, &err, &fput_needed);
if (!sock)
goto out;
@@ -1748,6 +1750,8 @@ SYSCALL_DEFINE6(recvfrom, int, fd, void
int err, err2;
int fput_needed;

+ if (size > INT_MAX)
+ size = INT_MAX;
sock = sockfd_lookup_light(fd, &err, &fput_needed);
if (!sock)
goto out;
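
The clamp added in both syscalls is small enough to show in isolation. In this sketch clamp_to_int_max() is a hypothetical helper; the point is that a size_t length larger than INT_MAX would otherwise turn negative once it is stored in an int further down the call chain.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: cap a length so it still fits in a signed int. */
static size_t clamp_to_int_max(size_t len)
{
    if (len > (size_t)INT_MAX)
        len = (size_t)INT_MAX;
    return len;
}
```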

2010-12-08 00:34:42

by Greg KH

Subject: [28/44] ARM: 6482/2: Fix find_next_zero_bit and related assembly

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: James Jones <[email protected]>

commit 0e91ec0c06d2cd15071a6021c94840a50e6671aa upstream.

The find_next_bit, find_first_bit, find_next_zero_bit
and find_first_zero_bit functions were not properly
clamping to the maxbit argument at the bit level. They
were instead only checking maxbit at the byte level.
To fix this, add a compare and a conditional move
instruction to the end of the common bit-within-the-
byte code used by all the functions and be sure not to
clobber the maxbit argument before it is used.

Reviewed-by: Nicolas Pitre <[email protected]>
Tested-by: Stephen Warren <[email protected]>
Signed-off-by: James Jones <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/lib/findbit.S | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/arch/arm/lib/findbit.S
+++ b/arch/arm/lib/findbit.S
@@ -148,8 +148,8 @@ ENTRY(_find_next_bit_be)
*/
.L_found:
#if __LINUX_ARM_ARCH__ >= 5
- rsb r1, r3, #0
- and r3, r3, r1
+ rsb r0, r3, #0
+ and r3, r3, r0
clz r3, r3
rsb r3, r3, #31
add r0, r2, r3
@@ -164,5 +164,7 @@ ENTRY(_find_next_bit_be)
addeq r2, r2, #1
mov r0, r2
#endif
+ cmp r1, r0 @ Clamp to maxbit
+ movlo r0, r1
mov pc, lr
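
The behaviour the assembly fix restores can be sketched in C. This is not a translation of findbit.S: find_first_set_clamped() is a hypothetical byte-wise scan using little-endian bit numbering, written only to show why the result has to be compared against maxbit at the bit level and not just the byte level.

```c
/*
 * Hypothetical scan: return the index of the first set bit below maxbit,
 * or maxbit if there is none. The caller must supply at least
 * (maxbit + 7) / 8 bytes.
 */
static unsigned int find_first_set_clamped(const unsigned char *bits,
                                           unsigned int maxbit)
{
    unsigned int byte;

    for (byte = 0; byte * 8 < maxbit; byte++) {
        unsigned int bit = 0;
        unsigned int idx;

        if (bits[byte] == 0)
            continue;

        /* Locate the lowest set bit inside this byte. */
        while (!(bits[byte] & (1u << bit)))
            bit++;

        /*
         * The byte-level loop condition alone would let an index in
         * [maxbit, byte * 8 + 7] escape, so clamp here as well.
         */
        idx = byte * 8 + bit;
        return idx < maxbit ? idx : maxbit;
    }
    return maxbit;
}
```

With only bit 12 set and maxbit equal to 10, the scan still enters the second byte; the final comparison is what turns the answer into 10.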


2010-12-08 00:34:40

by Greg KH

Subject: [25/44] acpi-cpufreq: fix a memleak when unloading driver

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Zhang Rui <[email protected]>

commit dab5fff14df2cd16eb1ad4c02e83915e1063fece upstream.

We didn't free per_cpu(acfreq_data, cpu)->freq_table
when the acpi_cpufreq driver is unloaded.

This results in the following messages in /sys/kernel/debug/kmemleak:

unreferenced object 0xf6450e80 (size 64):
comm "modprobe", pid 1066, jiffies 4294677317 (age 19290.453s)
hex dump (first 32 bytes):
00 00 00 00 e8 a2 24 00 01 00 00 00 00 9f 24 00 ......$.......$.
02 00 00 00 00 6a 18 00 03 00 00 00 00 35 0c 00 .....j.......5..
backtrace:
[<c123ba97>] kmemleak_alloc+0x27/0x50
[<c109f96f>] __kmalloc+0xcf/0x110
[<f9da97ee>] acpi_cpufreq_cpu_init+0x1ee/0x4e4 [acpi_cpufreq]
[<c11cd8d2>] cpufreq_add_dev+0x142/0x3a0
[<c11920b7>] sysdev_driver_register+0x97/0x110
[<c11cce56>] cpufreq_register_driver+0x86/0x140
[<f9dad080>] 0xf9dad080
[<c1001130>] do_one_initcall+0x30/0x160
[<c10626e9>] sys_init_module+0x99/0x1e0
[<c1002d97>] sysenter_do_call+0x12/0x26
[<ffffffff>] 0xffffffff

https://bugzilla.kernel.org/show_bug.cgi?id=15807#c21

Tested-by: Toralf Forster <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Len Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c | 1 +
1 file changed, 1 insertion(+)

--- a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -742,6 +742,7 @@ static int acpi_cpufreq_cpu_exit(struct
per_cpu(drv_data, policy->cpu) = NULL;
acpi_processor_unregister_performance(data->acpi_data,
policy->cpu);
+ kfree(data->freq_table);
kfree(data);
}


2010-12-08 00:35:18

by Greg KH

Subject: [44/44] econet: fix CVE-2010-3850

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Phil Blundell <[email protected]>

commit 16c41745c7b92a243d0874f534c1655196c64b74 upstream.

Add missing check for capable(CAP_NET_ADMIN) in SIOCSIFADDR operation.

Signed-off-by: Phil Blundell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/econet/af_econet.c | 3 +++
1 file changed, 3 insertions(+)

--- a/net/econet/af_econet.c
+++ b/net/econet/af_econet.c
@@ -661,6 +661,9 @@ static int ec_dev_ioctl(struct socket *s
err = 0;
switch (cmd) {
case SIOCSIFADDR:
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
edev = dev->ec_ptr;
if (edev == NULL) {
/* Magic up a new one. */

2010-12-08 00:34:58

by Greg KH

Subject: [37/44] ipv6: conntrack: Add member of user to nf_ct_frag6_queue structure

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Shan Wei <[email protected]>

commit c92b544bd5d8e7ed7d81c77bbecab6df2a95aa53 upstream.

Commit 0b5ccb2 (title: ipv6: reassembly: use seperate reassembly queues for
conntrack and local delivery) broke the saddr and daddr members of
nf_ct_frag6_queue when a new queue is created. As a result, the hash value
generated by nf_hashfn() no longer matched the one generated by fq_find(),
so a newly received fragment could not be inserted into the right queue.

This patch fixes the bug by adding a user member to the nf_ct_frag6_queue structure.

Signed-off-by: Shan Wei <[email protected]>
Acked-by: Patrick McHardy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Cc: Pascal Hambourg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -64,6 +64,7 @@ struct nf_ct_frag6_queue
struct inet_frag_queue q;

__be32 id; /* fragment id */
+ u32 user;
struct in6_addr saddr;
struct in6_addr daddr;


2010-12-08 00:34:56

by Greg KH

Subject: [38/44] x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet.

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: John Hughes <[email protected]>

commit f5eb917b861828da18dc28854308068c66d1449a upstream.

Here is a patch to stop X.25 examining fields beyond the end of the packet.

For example, when a simple CALL ACCEPTED was received:

10 10 0f

x25_parse_facilities was attempting to decode the FACILITIES field, but this
packet contains no facilities field.

Signed-off-by: John Hughes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/net/x25.h | 4 ++++
net/x25/af_x25.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
net/x25/x25_facilities.c | 12 +++++++++++-
net/x25/x25_in.c | 15 +++++++++++----
4 files changed, 72 insertions(+), 6 deletions(-)

--- a/include/net/x25.h
+++ b/include/net/x25.h
@@ -182,6 +182,10 @@ extern int sysctl_x25_clear_request_tim
extern int sysctl_x25_ack_holdback_timeout;
extern int sysctl_x25_forward;

+extern int x25_parse_address_block(struct sk_buff *skb,
+ struct x25_address *called_addr,
+ struct x25_address *calling_addr);
+
extern int x25_addr_ntoa(unsigned char *, struct x25_address *,
struct x25_address *);
extern int x25_addr_aton(unsigned char *, struct x25_address *,
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -80,6 +80,41 @@ struct compat_x25_subscrip_struct {
};
#endif

+
+int x25_parse_address_block(struct sk_buff *skb,
+ struct x25_address *called_addr,
+ struct x25_address *calling_addr)
+{
+ unsigned char len;
+ int needed;
+ int rc;
+
+ if (skb->len < 1) {
+ /* packet has no address block */
+ rc = 0;
+ goto empty;
+ }
+
+ len = *skb->data;
+ needed = 1 + (len >> 4) + (len & 0x0f);
+
+ if (skb->len < needed) {
+ /* packet is too short to hold the addresses it claims
+ to hold */
+ rc = -1;
+ goto empty;
+ }
+
+ return x25_addr_ntoa(skb->data, called_addr, calling_addr);
+
+empty:
+ *called_addr->x25_addr = 0;
+ *calling_addr->x25_addr = 0;
+
+ return rc;
+}
+
+
int x25_addr_ntoa(unsigned char *p, struct x25_address *called_addr,
struct x25_address *calling_addr)
{
@@ -871,16 +906,26 @@ int x25_rx_call_request(struct sk_buff *
/*
* Extract the X.25 addresses and convert them to ASCII strings,
* and remove them.
+ *
+ * Address block is mandatory in call request packets
*/
- addr_len = x25_addr_ntoa(skb->data, &source_addr, &dest_addr);
+ addr_len = x25_parse_address_block(skb, &source_addr, &dest_addr);
+ if (addr_len <= 0)
+ goto out_clear_request;
skb_pull(skb, addr_len);

/*
* Get the length of the facilities, skip past them for the moment
* get the call user data because this is needed to determine
* the correct listener
+ *
+ * Facilities length is mandatory in call request packets
*/
+ if (skb->len < 1)
+ goto out_clear_request;
len = skb->data[0] + 1;
+ if (skb->len < len)
+ goto out_clear_request;
skb_pull(skb,len);

/*
--- a/net/x25/x25_facilities.c
+++ b/net/x25/x25_facilities.c
@@ -35,7 +35,7 @@ int x25_parse_facilities(struct sk_buff
struct x25_dte_facilities *dte_facs, unsigned long *vc_fac_mask)
{
unsigned char *p = skb->data;
- unsigned int len = *p++;
+ unsigned int len;

*vc_fac_mask = 0;

@@ -50,6 +50,14 @@ int x25_parse_facilities(struct sk_buff
memset(dte_facs->called_ae, '\0', sizeof(dte_facs->called_ae));
memset(dte_facs->calling_ae, '\0', sizeof(dte_facs->calling_ae));

+ if (skb->len < 1)
+ return 0;
+
+ len = *p++;
+
+ if (len >= skb->len)
+ return -1;
+
while (len > 0) {
switch (*p & X25_FAC_CLASS_MASK) {
case X25_FAC_CLASS_A:
@@ -247,6 +255,8 @@ int x25_negotiate_facilities(struct sk_b
memcpy(new, ours, sizeof(*new));

len = x25_parse_facilities(skb, &theirs, dte, &x25->vc_facil_mask);
+ if (len < 0)
+ return len;

/*
* They want reverse charging, we won't accept it.
--- a/net/x25/x25_in.c
+++ b/net/x25/x25_in.c
@@ -89,6 +89,7 @@ static int x25_queue_rx_frame(struct soc
static int x25_state1_machine(struct sock *sk, struct sk_buff *skb, int frametype)
{
struct x25_address source_addr, dest_addr;
+ int len;

switch (frametype) {
case X25_CALL_ACCEPTED: {
@@ -106,11 +107,17 @@ static int x25_state1_machine(struct soc
* Parse the data in the frame.
*/
skb_pull(skb, X25_STD_MIN_LEN);
- skb_pull(skb, x25_addr_ntoa(skb->data, &source_addr, &dest_addr));
- skb_pull(skb,
- x25_parse_facilities(skb, &x25->facilities,
+
+ len = x25_parse_address_block(skb, &source_addr,
+ &dest_addr);
+ if (len > 0)
+ skb_pull(skb, len);
+
+ len = x25_parse_facilities(skb, &x25->facilities,
&x25->dte_facilities,
- &x25->vc_facil_mask));
+ &x25->vc_facil_mask);
+ if (len > 0)
+ skb_pull(skb, len);
/*
* Copy any Call User Data.
*/
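
The length test in x25_parse_address_block() can be tried outside the kernel with a plain byte buffer. The sketch below mirrors the arithmetic from the patch (one length byte whose two nibbles are added to it); address_block_check() is a hypothetical name, and it reports only whether the block is absent, too short, or acceptable.

```c
#include <stddef.h>

/*
 * Hypothetical check modelled on the patch:
 *   returns  0 if the packet has no address block at all,
 *           -1 if it is too short for the lengths it claims,
 *            1 if the claimed lengths fit.
 */
static int address_block_check(const unsigned char *data, size_t len)
{
    unsigned char lens;
    size_t needed;

    if (len < 1)
        return 0;

    lens = data[0];
    /* Same computation as the patch: 1 + high nibble + low nibble. */
    needed = 1 + (size_t)(lens >> 4) + (size_t)(lens & 0x0f);

    if (len < needed)
        return -1;
    return 1;
}
```

For the "10 10 0f" CALL ACCEPTED example, nothing remains once the three header bytes are pulled, which corresponds to the first case.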

2010-12-08 00:36:24

by Greg KH

Subject: [41/44] V4L/DVB: ivtvfb: prevent reading uninitialized stack memory

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Dan Rosenberg <[email protected]>

commit 405707985594169cfd0b1d97d29fcb4b4c6f2ac9 upstream.

The FBIOGET_VBLANK device ioctl allows unprivileged users to read 16
bytes of uninitialized stack memory, because the "reserved" member of
the fb_vblank struct declared on the stack is not altered or zeroed
before being copied back to the user. This patch takes care of it.

Signed-off-by: Dan Rosenberg <[email protected]>
Signed-off-by: Andy Walls <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/media/video/ivtv/ivtvfb.c | 2 ++
1 file changed, 2 insertions(+)

--- a/drivers/media/video/ivtv/ivtvfb.c
+++ b/drivers/media/video/ivtv/ivtvfb.c
@@ -460,6 +460,8 @@ static int ivtvfb_ioctl(struct fb_info *
struct fb_vblank vblank;
u32 trace;

+ memset(&vblank, 0, sizeof(struct fb_vblank));
+
vblank.flags = FB_VBLANK_HAVE_COUNT |FB_VBLANK_HAVE_VCOUNT |
FB_VBLANK_HAVE_VSYNC;
trace = read_reg(0x028c0) >> 16;

2010-12-08 00:36:41

by Greg KH

Subject: [40/44] can-bcm: fix minor heap overflow

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Oliver Hartkopp <[email protected]>

commit 0597d1b99fcfc2c0eada09a698f85ed413d4ba84 upstream.

On 64-bit platforms the ASCII representation of a pointer may be up to 17
bytes long. This patch increases the length of the buffer accordingly.

http://marc.info/?l=linux-netdev&m=128872251418192&w=2

Reported-by: Dan Rosenberg <[email protected]>
Signed-off-by: Oliver Hartkopp <[email protected]>
CC: Linus Torvalds <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/can/bcm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -121,7 +121,7 @@ struct bcm_sock {
struct list_head tx_ops;
unsigned long dropped_usr_msgs;
struct proc_dir_entry *bcm_proc_read;
- char procname [9]; /* pointer printed in ASCII with \0 */
+ char procname [20]; /* pointer printed in ASCII with \0 */
};

static inline struct bcm_sock *bcm_sk(const struct sock *sk)
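
The size problem is easy to reproduce in userspace. The sketch below formats a 64-bit value as hexadecimal with snprintf() and reports whether it fit; format_hex64() is a hypothetical helper, and a fixed-width integer is used in place of "%p" because the exact output of "%p" is implementation-defined.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: print value as lowercase hex into buf.
 * Returns 1 if the text and its terminating NUL fit, 0 if truncated.
 */
static int format_hex64(char *buf, size_t size, uint64_t value)
{
    int n = snprintf(buf, size, "%" PRIx64, value);

    return n >= 0 && (size_t)n < size;
}
```

A 64-bit value needs up to 16 hex digits plus the NUL, so a 9-byte buffer is too small and a 20-byte one has room to spare. Unlike a plain sprintf(), snprintf() truncates without writing beyond the buffer.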

2010-12-08 00:36:00

by Greg KH

Subject: [42/44] x25: Prevent crashing when parsing bad X.25 facilities

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Dan Rosenberg <[email protected]>

commit 5ef41308f94dcbb3b7afc56cdef1c2ba53fa5d2f upstream.

Now with improved comma support.

On parsing malformed X.25 facilities, decrementing the remaining length
may cause it to underflow. Since the length is an unsigned integer,
this will result in the loop continuing until the kernel crashes.

This patch adds checks to ensure decrementing the remaining length does
not cause it to wrap around.

Signed-off-by: Dan Rosenberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/x25/x25_facilities.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

--- a/net/x25/x25_facilities.c
+++ b/net/x25/x25_facilities.c
@@ -61,6 +61,8 @@ int x25_parse_facilities(struct sk_buff
while (len > 0) {
switch (*p & X25_FAC_CLASS_MASK) {
case X25_FAC_CLASS_A:
+ if (len < 2)
+ return 0;
switch (*p) {
case X25_FAC_REVERSE:
if((p[1] & 0x81) == 0x81) {
@@ -104,6 +106,8 @@ int x25_parse_facilities(struct sk_buff
len -= 2;
break;
case X25_FAC_CLASS_B:
+ if (len < 3)
+ return 0;
switch (*p) {
case X25_FAC_PACKET_SIZE:
facilities->pacsize_in = p[1];
@@ -125,6 +129,8 @@ int x25_parse_facilities(struct sk_buff
len -= 3;
break;
case X25_FAC_CLASS_C:
+ if (len < 4)
+ return 0;
printk(KERN_DEBUG "X.25: unknown facility %02X, "
"values %02X, %02X, %02X\n",
p[0], p[1], p[2], p[3]);
@@ -132,6 +138,8 @@ int x25_parse_facilities(struct sk_buff
len -= 4;
break;
case X25_FAC_CLASS_D:
+ if (len < p[1] + 2)
+ return 0;
switch (*p) {
case X25_FAC_CALLING_AE:
if (p[1] > X25_MAX_DTE_FACIL_LEN || p[1] <= 1)
@@ -149,9 +157,7 @@ int x25_parse_facilities(struct sk_buff
break;
default:
printk(KERN_DEBUG "X.25: unknown facility %02X,"
- "length %d, values %02X, %02X, "
- "%02X, %02X\n",
- p[0], p[1], p[2], p[3], p[4], p[5]);
+ "length %d\n", p[0], p[1]);
break;
}
len -= p[1] + 2;
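
The underflow described in this patch is a general hazard when walking variable-length records with an unsigned remaining-length counter. A minimal sketch, assuming a simplified type/length/value layout (one type byte, one length byte, then the value), which is not the X.25 facility encoding; count_tlv_records() is a hypothetical name:

```c
#include <stddef.h>

/*
 * Hypothetical walker: count well-formed records in p[0..len).
 * Returns the number of records, or -1 if a record claims more bytes
 * than remain.
 */
static int count_tlv_records(const unsigned char *p, size_t len)
{
    int count = 0;

    while (len > 0) {
        size_t rec;

        /* Need the type and length bytes before p[1] may be read. */
        if (len < 2)
            return -1;

        rec = (size_t)p[1] + 2;

        /* Without this test, len -= rec would wrap to a huge value. */
        if (len < rec)
            return -1;

        p += rec;
        len -= rec;
        count++;
    }
    return count;
}
```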

2010-12-08 00:34:49

by Greg KH

Subject: [32/44] rose: Fix signedness issues wrt. digi count.

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------


From: David S. Miller <[email protected]>

[ Upstream commit 9828e6e6e3f19efcb476c567b9999891d051f52f ]

Just use explicit casts, since we really can't change the
types of structures exported to userspace which have been
around for 15 years or so.

Reported-by: Dan Rosenberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/rose/af_rose.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -680,7 +680,7 @@ static int rose_bind(struct socket *sock
if (addr_len == sizeof(struct sockaddr_rose) && addr->srose_ndigis > 1)
return -EINVAL;

- if (addr->srose_ndigis > ROSE_MAX_DIGIS)
+ if ((unsigned int) addr->srose_ndigis > ROSE_MAX_DIGIS)
return -EINVAL;

if ((dev = rose_dev_get(&addr->srose_addr)) == NULL) {
@@ -740,7 +740,7 @@ static int rose_connect(struct socket *s
if (addr_len == sizeof(struct sockaddr_rose) && addr->srose_ndigis > 1)
return -EINVAL;

- if (addr->srose_ndigis > ROSE_MAX_DIGIS)
+ if ((unsigned int) addr->srose_ndigis > ROSE_MAX_DIGIS)
return -EINVAL;

/* Source + Destination digis should not exceed ROSE_MAX_DIGIS */
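
Why the cast matters can be shown with two tiny functions. This sketch assumes the count is held in a signed int, and DEMO_MAX_DIGIS is an illustrative limit, not the kernel constant.

```c
/* Illustrative limit; the real ROSE_MAX_DIGIS lives in the kernel headers. */
#define DEMO_MAX_DIGIS 6

/* Signed comparison: a negative count slips through as "valid". */
static int check_signed(int ndigis)
{
    return ndigis > DEMO_MAX_DIGIS ? -1 : 0;
}

/* Unsigned comparison: a negative count becomes very large and fails. */
static int check_unsigned(int ndigis)
{
    return (unsigned int)ndigis > DEMO_MAX_DIGIS ? -1 : 0;
}
```

Passing -1 to the first function returns 0 (accepted), while the second returns -1 (rejected), because -1 converts to UINT_MAX.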

2010-12-08 00:37:00

by Greg KH

Subject: [35/44] tcp: Fix race in tcp_poll

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------


From: Tom Marshall <[email protected]>

[ Upstream commit a4d258036ed9b2a1811c3670c6099203a0f284a0 ]

If a RST comes in immediately after checking sk->sk_err, tcp_poll will
return POLLIN but not POLLOUT. Fix this by checking sk->sk_err at the end
of tcp_poll. Additionally, ensure the correct order of operations on SMP
machines with memory barriers.

Signed-off-by: Tom Marshall <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp.c | 7 +++++--
net/ipv4/tcp_input.c | 2 ++
2 files changed, 7 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -347,8 +347,6 @@ unsigned int tcp_poll(struct file *file,
*/

mask = 0;
- if (sk->sk_err)
- mask = POLLERR;

/*
* POLLHUP is certainly not done right. But poll() doesn't
@@ -413,6 +411,11 @@ unsigned int tcp_poll(struct file *file,
if (tp->urg_data & TCP_URG_VALID)
mask |= POLLPRI;
}
+ /* This barrier is coupled with smp_wmb() in tcp_reset() */
+ smp_rmb();
+ if (sk->sk_err)
+ mask |= POLLERR;
+
return mask;
}

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3617,6 +3617,8 @@ static void tcp_reset(struct sock *sk)
default:
sk->sk_err = ECONNRESET;
}
+ /* This barrier is coupled with smp_rmb() in tcp_poll() */
+ smp_wmb();

if (!sock_flag(sk, SOCK_DEAD))
sk->sk_error_report(sk);

2010-12-08 00:34:51

by Greg KH

Subject: [34/44] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Robin Holt <[email protected]>

[ Problem was fixed differently upstream. -DaveM ]

On a 16TB x86_64 machine, sysctl_tcp_mem[2], sysctl_udp_mem[2], and
sysctl_sctp_mem[2] can integer overflow. Set limit such that they are
maximized without overflowing.

Signed-off-by: Robin Holt <[email protected]>
To: "David S. Miller" <[email protected]>
Cc: Willy Tarreau <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Alexey Kuznetsov <[email protected]>
Cc: "Pekka Savola (ipv6)" <[email protected]>
Cc: James Morris <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Patrick McHardy <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Cc: Sridhar Samudrala <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp.c | 4 +++-
net/ipv4/udp.c | 4 +++-
net/sctp/protocol.c | 4 +++-
3 files changed, 9 insertions(+), 3 deletions(-)

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2754,12 +2754,14 @@ void __init tcp_init(void)

/* Set the pressure threshold to be a fraction of global memory that
* is up to 1/2 at 256 MB, decreasing toward zero with the amount of
- * memory, with a floor of 128 pages.
+ * memory, with a floor of 128 pages, and a ceiling that prevents an
+ * integer overflow.
*/
nr_pages = totalram_pages - totalhigh_pages;
limit = min(nr_pages, 1UL<<(28-PAGE_SHIFT)) >> (20-PAGE_SHIFT);
limit = (limit * (nr_pages >> (20-PAGE_SHIFT))) >> (PAGE_SHIFT-11);
limit = max(limit, 128UL);
+ limit = min(limit, INT_MAX * 4UL / 3 / 2);
sysctl_tcp_mem[0] = limit / 4 * 3;
sysctl_tcp_mem[1] = limit;
sysctl_tcp_mem[2] = sysctl_tcp_mem[0] * 2;
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1722,11 +1722,13 @@ void __init udp_init(void)

/* Set the pressure threshold up by the same strategy of TCP. It is a
* fraction of global memory that is up to 1/2 at 256 MB, decreasing
- * toward zero with the amount of memory, with a floor of 128 pages.
+ * toward zero with the amount of memory, with a floor of 128 pages,
+ * and a ceiling that prevents an integer overflow.
*/
limit = min(nr_all_pages, 1UL<<(28-PAGE_SHIFT)) >> (20-PAGE_SHIFT);
limit = (limit * (nr_all_pages >> (20-PAGE_SHIFT))) >> (PAGE_SHIFT-11);
limit = max(limit, 128UL);
+ limit = min(limit, INT_MAX * 4UL / 3 / 2);
sysctl_udp_mem[0] = limit / 4 * 3;
sysctl_udp_mem[1] = limit;
sysctl_udp_mem[2] = sysctl_udp_mem[0] * 2;
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1179,7 +1179,8 @@ SCTP_STATIC __init int sctp_init(void)

/* Set the pressure threshold to be a fraction of global memory that
* is up to 1/2 at 256 MB, decreasing toward zero with the amount of
- * memory, with a floor of 128 pages.
+ * memory, with a floor of 128 pages, and a ceiling that prevents an
+ * integer overflow.
* Note this initalizes the data in sctpv6_prot too
* Unabashedly stolen from tcp_init
*/
@@ -1187,6 +1188,7 @@ SCTP_STATIC __init int sctp_init(void)
limit = min(nr_pages, 1UL<<(28-PAGE_SHIFT)) >> (20-PAGE_SHIFT);
limit = (limit * (nr_pages >> (20-PAGE_SHIFT))) >> (PAGE_SHIFT-11);
limit = max(limit, 128UL);
+ limit = min(limit, INT_MAX * 4UL / 3 / 2);
sysctl_sctp_mem[0] = limit / 4 * 3;
sysctl_sctp_mem[1] = limit;
sysctl_sctp_mem[2] = sysctl_sctp_mem[0] * 2;

2010-12-08 00:37:33

by Greg KH

Subject: [33/44] net: Fix the condition passed to sk_wait_event()

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------


From: Nagendra Tomar <[email protected]>

[ Upstream commit 482964e56e1320cb7952faa1932d8ecf59c4bf75 ]

This patch fixes the condition (3rd arg) passed to sk_wait_event() in
sk_stream_wait_memory(). The incorrect check in sk_stream_wait_memory()
causes the following soft lockup in tcp_sendmsg() when the global tcp
memory pool is exhausted.

>>> snip <<<

localhost kernel: BUG: soft lockup - CPU#3 stuck for 11s! [sshd:6429]
localhost kernel: CPU 3:
localhost kernel: RIP: 0010:[sk_stream_wait_memory+0xcd/0x200] [sk_stream_wait_memory+0xcd/0x200] sk_stream_wait_memory+0xcd/0x200
localhost kernel:
localhost kernel: Call Trace:
localhost kernel: [sk_stream_wait_memory+0x1b1/0x200] sk_stream_wait_memory+0x1b1/0x200
localhost kernel: [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel: [ipv6:tcp_sendmsg+0x6e6/0xe90] tcp_sendmsg+0x6e6/0xce0
localhost kernel: [sock_aio_write+0x126/0x140] sock_aio_write+0x126/0x140
localhost kernel: [xfs:do_sync_write+0xf1/0x130] do_sync_write+0xf1/0x130
localhost kernel: [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel: [hrtimer_start+0xe3/0x170] hrtimer_start+0xe3/0x170
localhost kernel: [vfs_write+0x185/0x190] vfs_write+0x185/0x190
localhost kernel: [sys_write+0x50/0x90] sys_write+0x50/0x90
localhost kernel: [system_call+0x7e/0x83] system_call+0x7e/0x83

>>> snip <<<

What happens is that the sk_wait_event() condition passed from
sk_stream_wait_memory() evaluates to true when tcp global memory is
exhausted. Both sk_stream_memory_free() and vm_wait are true, which
causes sk_wait_event() to *not* call schedule_timeout(). Hence
sk_stream_wait_memory() returns immediately to the caller without
sleeping. The caller then retries the allocation, which fails again, and
calls sk_stream_wait_memory() again, and so on.

[ Bug introduced by commit c1cbe4b7ad0bc4b1d98ea708a3fecb7362aa4088
("[NET]: Avoid atomic xchg() for non-error case") -DaveM ]

Signed-off-by: Nagendra Singh Tomar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
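Reviewer note: the old and new conditions can be compared as plain boolean
functions. This is only a sketch; struct sketch_state and the sketch_*
helpers are hypothetical stand-ins for the socket fields and for
sk_stream_memory_free(), not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_state {
	int err;		/* stands in for sk->sk_err */
	bool send_shutdown;	/* sk->sk_shutdown & SEND_SHUTDOWN */
	bool memory_free;	/* sk_stream_memory_free(sk) */
	bool vm_wait;		/* waiting on global memory pressure */
};

/* Condition before the patch. */
static bool sketch_old_cond(const struct sketch_state *s)
{
	return !s->err && !s->send_shutdown && s->memory_free && s->vm_wait;
}

/* Condition after the patch. */
static bool sketch_new_cond(const struct sketch_state *s)
{
	return s->err || s->send_shutdown || (s->memory_free && !s->vm_wait);
}

/* Global memory exhausted: the socket has room, but vm_wait is set. */
static void sketch_check_exhaustion_case(void)
{
	const struct sketch_state s = {
		.err = 0, .send_shutdown = false,
		.memory_free = true, .vm_wait = true,
	};

	assert(sketch_old_cond(&s));	/* old: already "true", so no sleep */
	assert(!sketch_new_cond(&s));	/* new: false, so the caller sleeps */
}
```

Since sk_wait_event() only sleeps while its condition is false, the old
expression skipped the sleep in exactly the case that needed it.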
net/core/stream.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -139,10 +139,10 @@ int sk_stream_wait_memory(struct sock *s

set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
sk->sk_write_pending++;
- sk_wait_event(sk, &current_timeo, !sk->sk_err &&
- !(sk->sk_shutdown & SEND_SHUTDOWN) &&
- sk_stream_memory_free(sk) &&
- vm_wait);
+ sk_wait_event(sk, &current_timeo, sk->sk_err ||
+ (sk->sk_shutdown & SEND_SHUTDOWN) ||
+ (sk_stream_memory_free(sk) &&
+ !vm_wait));
sk->sk_write_pending--;

if (vm_wait) {

2010-12-08 00:37:53

by Greg KH

Subject: [30/44] xfrm4: strip ECN and IP Precedence bits in policy lookup

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------


From: Ulrich Weber <[email protected]>

[ Upstream commit 94e2238969e89f5112297ad2a00103089dde7e8f ]

Don't compare ECN and IP Precedence bits in find_bundle,
and use the TOS value with the ECN bits stripped in xfrm_lookup.

Signed-off-by: Ulrich Weber <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
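Reviewer note: a small userspace sketch of the masked comparison. The
mask values are assumptions copied in by hand (IPTOS_TOS_MASK taken as
0x1E, IPTOS_RT_MASK as that value with the low two bits cleared); the
sketch_* names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, see the note above. */
#define SKETCH_IPTOS_TOS_MASK	0x1E
#define SKETCH_IPTOS_RT_MASK	(SKETCH_IPTOS_TOS_MASK & ~3)

/* Two TOS bytes match if they agree on the routing-relevant bits. */
static bool sketch_tos_match(uint8_t a, uint8_t b)
{
	return !((a ^ b) & SKETCH_IPTOS_RT_MASK);
}

/* Same masking as the patched xfrm4_get_tos(). */
static uint8_t sketch_get_tos(uint8_t tos)
{
	return SKETCH_IPTOS_RT_MASK & tos;
}

static void sketch_check_ecn_ignored(void)
{
	assert(sketch_tos_match(0x10, 0x10 | 0x03));	/* only ECN bits differ */
	assert(!sketch_tos_match(0x10, 0x08));		/* routing TOS differs */
}
```

The XOR-then-mask form compares only the bits inside the mask, so two
flows that differ solely in ECN or precedence bits share a bundle.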
net/ipv4/xfrm4_policy.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -69,7 +69,7 @@ __xfrm4_find_bundle(struct flowi *fl, st
if (xdst->u.rt.fl.oif == fl->oif && /*XXX*/
xdst->u.rt.fl.fl4_dst == fl->fl4_dst &&
xdst->u.rt.fl.fl4_src == fl->fl4_src &&
- xdst->u.rt.fl.fl4_tos == fl->fl4_tos &&
+ !((xdst->u.rt.fl.fl4_tos ^ fl->fl4_tos) & IPTOS_RT_MASK) &&
xfrm_bundle_ok(policy, xdst, fl, AF_INET, 0)) {
dst_clone(dst);
break;
@@ -81,7 +81,7 @@ __xfrm4_find_bundle(struct flowi *fl, st

static int xfrm4_get_tos(struct flowi *fl)
{
- return fl->fl4_tos;
+ return IPTOS_RT_MASK & fl->fl4_tos; /* Strip ECN bits */
}

static int xfrm4_init_path(struct xfrm_dst *path, struct dst_entry *dst,

2010-12-08 00:38:14

by Greg KH

Subject: [29/44] net: clear heap allocations for privileged ethtool actions

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------


From: Kees Cook <[email protected]>

[ Upstream commit b00916b189d13a615ff05c9242201135992fcda3 ]

Several other ethtool functions allocate heap buffers that drivers may
leave (partly) uncleared. Some interfaces appear safe (eeprom, etc), in
that the sizes are well controlled. In some situations (e.g. unchecked
error conditions), areas of the heap buffer remain unchanged before being
copied back to userspace. Note that these are less of an issue since they
all require CAP_NET_ADMIN.

Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
Acked-by: Ben Hutchings <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/ethtool.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -256,7 +256,7 @@ static int ethtool_get_regs(struct net_d
if (regs.len > reglen)
regs.len = reglen;

- regbuf = kmalloc(reglen, GFP_USER);
+ regbuf = kzalloc(reglen, GFP_USER);
if (!regbuf)
return -ENOMEM;


2010-12-08 00:38:31

by Greg KH

Subject: [27/44] DECnet: dont leak uninitialized stack byte

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Dan Rosenberg <[email protected]>

commit 3c6f27bf33052ea6ba9d82369fb460726fb779c0 upstream.

A single uninitialized padding byte is leaked to userspace.

Signed-off-by: Dan Rosenberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
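Reviewer note: the zero-then-fill pattern can be tried in userspace.
struct sketch_linkinfo is a hypothetical stand-in, not the real
linkinfo_dn layout, and memcpy() stands in for copy_to_user(). Strictly
speaking the C standard leaves padding contents unspecified after member
stores; the sketch relies on the usual GCC/Clang behaviour, as the kernel
does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* On common ABIs this struct has one trailing padding byte. */
struct sketch_linkinfo {
	uint16_t segsize;
	uint8_t linkstate;
};

/* Fill `out` (sizeof(struct sketch_linkinfo) bytes): zero first, then set fields. */
static void sketch_fill_reply(unsigned char *out, uint16_t segsize, uint8_t state)
{
	struct sketch_linkinfo link;

	assert(out != NULL);
	memset(&link, 0, sizeof(link));		/* clears padding as well as fields */
	link.segsize = segsize;
	link.linkstate = state;
	memcpy(out, &link, sizeof(link));	/* stands in for copy_to_user() */
}
```

Without the memset(), the bytes after linkstate would carry whatever was
previously on the stack.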
net/decnet/af_decnet.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/decnet/af_decnet.c
+++ b/net/decnet/af_decnet.c
@@ -1558,6 +1558,8 @@ static int __dn_getsockopt(struct socket
if (r_len > sizeof(struct linkinfo_dn))
r_len = sizeof(struct linkinfo_dn);

+ memset(&link, 0, sizeof(link));
+
switch(sock->state) {
case SS_CONNECTING:
link.idn_linkstate = LL_CONNECTING;

2010-12-08 00:38:43

by Greg KH

Subject: [26/44] do_exit(): make sure that we run with get_fs() == USER_DS

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Nelson Elhage <[email protected]>

commit 33dd94ae1ccbfb7bf0fb6c692bc3d1c4269e6177 upstream.

If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not
otherwise reset before do_exit(). do_exit may later (via mm_release in
fork.c) do a put_user to a user-controlled address, potentially allowing
a user to leverage an oops into a controlled write into kernel memory.

This is only triggerable in the presence of another bug, but this
potentially turns a lot of DoS bugs into privilege escalations, so it's
worth fixing. I have proof-of-concept code which uses this bug along
with CVE-2010-3849 to write a zero to an arbitrary kernel address, so
I've tested that this is not theoretical.

A more logical place to put this fix might be when we know an oops has
occurred, before we call do_exit(), but that would involve changing
every architecture, in multiple places.

Let's just stick it in do_exit instead.

[[email protected]: update code comment]
Signed-off-by: Nelson Elhage <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/exit.c | 9 +++++++++
1 file changed, 9 insertions(+)

--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1004,6 +1004,15 @@ NORET_TYPE void do_exit(long code)
if (unlikely(!tsk->pid))
panic("Attempted to kill the idle task!");

+ /*
+ * If do_exit is called because this processes oopsed, it's possible
+ * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
+ * continuing. Amongst other possible reasons, this is to prevent
+ * mm_release()->clear_child_tid() from writing to a user-controlled
+ * kernel address.
+ */
+ set_fs(USER_DS);
+
tracehook_report_exit(&code);

/*

2010-12-08 00:34:35

by Greg KH

Subject: [20/44] USB: storage: sierra_ms: fix sysfs file attribute

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Greg Kroah-Hartman <[email protected]>

commit d9624e75f6ad94d8a0718c1fafa89186d271a78c upstream.

A non-writable sysfs file shouldn't have writable attributes.

Reported-by: Linus Torvalds <[email protected]>
Cc: Kevin Lloyd <[email protected]>
Cc: Matthew Dharm <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/storage/sierra_ms.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/usb/storage/sierra_ms.c
+++ b/drivers/usb/storage/sierra_ms.c
@@ -120,7 +120,7 @@ static ssize_t show_truinst(struct devic
}
return result;
}
-static DEVICE_ATTR(truinst, S_IWUGO | S_IRUGO, show_truinst, NULL);
+static DEVICE_ATTR(truinst, S_IRUGO, show_truinst, NULL);

int sierra_ms_init(struct us_data *us)
{

2010-12-08 00:39:00

by Greg KH

Subject: [24/44] USB: misc: trancevibrator: fix up a sysfs attribute permission

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Greg Kroah-Hartman <[email protected]>

commit d489a4b3926bad571d404ca6508f6744b9602776 upstream.

It should not be writable by any user.

Reported-by: Linus Torvalds <[email protected]>
Cc: Sam Hocevar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/misc/trancevibrator.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/usb/misc/trancevibrator.c
+++ b/drivers/usb/misc/trancevibrator.c
@@ -85,7 +85,7 @@ static ssize_t set_speed(struct device *
return count;
}

-static DEVICE_ATTR(speed, S_IWUGO | S_IRUGO, show_speed, set_speed);
+static DEVICE_ATTR(speed, S_IRUGO | S_IWUSR, show_speed, set_speed);

static int tv_probe(struct usb_interface *interface,
const struct usb_device_id *id)

2010-12-08 00:39:17

by Greg KH

Subject: [22/44] USB: misc: cypress_cy7c63: fix up some sysfs attribute permissions

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Greg Kroah-Hartman <[email protected]>

commit c990600d340641150f7270470a64bd99a5c0b225 upstream.

They should not be writable by any user.

Reported-by: Linus Torvalds <[email protected]>
Cc: Oliver Bock <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/misc/cypress_cy7c63.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

--- a/drivers/usb/misc/cypress_cy7c63.c
+++ b/drivers/usb/misc/cypress_cy7c63.c
@@ -195,11 +195,9 @@ static ssize_t get_port1_handler(struct
return read_port(dev, attr, buf, 1, CYPRESS_READ_PORT_ID1);
}

-static DEVICE_ATTR(port0, S_IWUGO | S_IRUGO,
- get_port0_handler, set_port0_handler);
+static DEVICE_ATTR(port0, S_IRUGO | S_IWUSR, get_port0_handler, set_port0_handler);

-static DEVICE_ATTR(port1, S_IWUGO | S_IRUGO,
- get_port1_handler, set_port1_handler);
+static DEVICE_ATTR(port1, S_IRUGO | S_IWUSR, get_port1_handler, set_port1_handler);


static int cypress_probe(struct usb_interface *interface,

2010-12-08 00:34:34

by Greg KH

Subject: [19/44] USB: EHCI: fix obscure race in ehci_endpoint_disable

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Alan Stern <[email protected]>

commit 02e2c51ba3e80acde600721ea784c3ef84da5ea1 upstream.

This patch (as1435) fixes an obscure and unlikely race in ehci-hcd.
When an async URB is unlinked, the corresponding QH is removed from
the async list. If the QH's endpoint is then disabled while the URB
is being given back, ehci_endpoint_disable() won't find the QH on the
async list, causing it to believe that the QH has been lost. This
will lead to a memory leak at best and quite possibly to an oops.

The solution is to trust usbcore not to lose track of endpoints. If
the QH isn't on the async list then it doesn't need to be taken off
the list, but the driver should still wait for the QH to become IDLE
before disabling it.

In theory this fixes Bugzilla #20182. In fact the race is so rare
that it's not possible to tell whether the bug is still present.
However, adding delays and making other changes to force the race
seems to show that the patch works.

Signed-off-by: Alan Stern <[email protected]>
Reported-by: Stefan Richter <[email protected]>
CC: David Brownell <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/host/ehci-hcd.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -954,10 +954,11 @@ rescan:
tmp && tmp != qh;
tmp = tmp->qh_next.qh)
continue;
- /* periodic qh self-unlinks on empty */
- if (!tmp)
- goto nogood;
- unlink_async (ehci, qh);
+ /* periodic qh self-unlinks on empty, and a COMPLETING qh
+ * may already be unlinked.
+ */
+ if (tmp)
+ unlink_async(ehci, qh);
/* FALL THROUGH */
case QH_STATE_UNLINK: /* wait for hw to finish? */
case QH_STATE_UNLINK_WAIT:
@@ -972,7 +973,6 @@ idle_timeout:
}
/* else FALL THROUGH */
default:
-nogood:
/* caller was supposed to have unlinked any requests;
* that's not our job. just leak this memory.
*/

2010-12-08 00:39:37

by Greg KH

Subject: [21/44] USB: atm: ueagle-atm: fix up some permissions on the sysfs files

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Greg Kroah-Hartman <[email protected]>

commit e502ac5e1eca99d7dc3f12b2a6780ccbca674858 upstream.

Some of the sysfs files had the incorrect permissions. Some didn't make
sense at all (writable for a file that you could not write to?)

Reported-by: Linus Torvalds <[email protected]>
Cc: Matthieu Castet <[email protected]>
Cc: Stanislaw Gruszka <[email protected]>
Cc: Damien Bergamini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/atm/ueagle-atm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/usb/atm/ueagle-atm.c
+++ b/drivers/usb/atm/ueagle-atm.c
@@ -2258,7 +2258,7 @@ out:
return ret;
}

-static DEVICE_ATTR(stat_status, S_IWUGO | S_IRUGO, read_status, reboot);
+static DEVICE_ATTR(stat_status, S_IWUSR | S_IRUGO, read_status, reboot);

static ssize_t read_human_status(struct device *dev, struct device_attribute *attr,
char *buf)
@@ -2321,7 +2321,7 @@ out:
return ret;
}

-static DEVICE_ATTR(stat_human_status, S_IWUGO | S_IRUGO, read_human_status, NULL);
+static DEVICE_ATTR(stat_human_status, S_IRUGO, read_human_status, NULL);

static ssize_t read_delin(struct device *dev, struct device_attribute *attr,
char *buf)
@@ -2353,7 +2353,7 @@ out:
return ret;
}

-static DEVICE_ATTR(stat_delin, S_IWUGO | S_IRUGO, read_delin, NULL);
+static DEVICE_ATTR(stat_delin, S_IRUGO, read_delin, NULL);

#define UEA_ATTR(name, reset) \
\

2010-12-08 00:39:56

by Greg KH

Subject: [18/44] usb: core: fix information leak to userland

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Vasiliy Kulikov <[email protected]>

commit 886ccd4520064408ce5876cfe00554ce52ecf4a7 upstream.

Structure usbdevfs_connectinfo is copied to userland with the padding bytes
after the "slow" field uninitialized. This leaks contents of kernel stack
memory.

Signed-off-by: Vasiliy Kulikov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/core/devio.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -883,10 +883,11 @@ static int proc_getdriver(struct dev_sta

static int proc_connectinfo(struct dev_state *ps, void __user *arg)
{
- struct usbdevfs_connectinfo ci;
+ struct usbdevfs_connectinfo ci = {
+ .devnum = ps->dev->devnum,
+ .slow = ps->dev->speed == USB_SPEED_LOW
+ };

- ci.devnum = ps->dev->devnum;
- ci.slow = ps->dev->speed == USB_SPEED_LOW;
if (copy_to_user(arg, &ci, sizeof(ci)))
return -EFAULT;
return 0;

2010-12-08 00:40:21

by Greg KH

Subject: [17/44] usb: misc: iowarrior: fix information leak to userland

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Vasiliy Kulikov <[email protected]>

commit eca67aaeebd6e5d22b0d991af1dd0424dc703bfb upstream.

Structure iowarrior_info is copied to userland with the padding bytes
between the "serial" and "revision" fields uninitialized. This leaks
contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <[email protected]>
Acked-by: Kees Cook <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/misc/iowarrior.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/usb/misc/iowarrior.c
+++ b/drivers/usb/misc/iowarrior.c
@@ -551,6 +551,7 @@ static long iowarrior_ioctl(struct file
/* needed for power consumption */
struct usb_config_descriptor *cfg_descriptor = &dev->udev->actconfig->desc;

+ memset(&info, 0, sizeof(info));
/* directly from the descriptor */
info.vendor = le16_to_cpu(dev->udev->descriptor.idVendor);
info.product = dev->product_id;

2010-12-08 00:34:29

by Greg KH

Subject: [15/44] libata: fix NULL sdev dereference race in atapi_qc_complete()

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Tejun Heo <[email protected]>

commit 2a5f07b5ec098edc69e05fdd2f35d3fbb1235723 upstream.

SCSI commands may be issued between __scsi_add_device() and dev->sdev
assignment, so it's unsafe for atapi_qc_complete() to dereference
dev->sdev->locked without checking whether it's NULL or not. Fix it.

Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/ata/libata-scsi.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2371,8 +2371,11 @@ static void atapi_qc_complete(struct ata
*
* If door lock fails, always clear sdev->locked to
* avoid this infinite loop.
+ *
+ * This may happen before SCSI scan is complete. Make
+ * sure qc->dev->sdev isn't NULL before dereferencing.
*/
- if (qc->cdb[0] == ALLOW_MEDIUM_REMOVAL)
+ if (qc->cdb[0] == ALLOW_MEDIUM_REMOVAL && qc->dev->sdev)
qc->dev->sdev->locked = 0;

qc->scsicmd->result = SAM_STAT_CHECK_CONDITION;

2010-12-08 00:40:54

by Greg KH

Subject: [14/44] bio: take care not overflow page count when mapping/copying user data

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Jens Axboe <[email protected]>

commit cb4644cac4a2797afc847e6c92736664d4b0ea34 upstream.

If the iovec is being set up in a way that causes uaddr + PAGE_SIZE
to overflow, we could end up attempting to map a huge number of
pages. Check for this invalid input type.

Reported-by: Dan Rosenberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
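Reviewer note: the wraparound that the new check catches is easy to
reproduce with unsigned arithmetic. The sketch below uses a hypothetical
helper and an assumed 4 KiB page size; it is not the kernel function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT	12	/* assumed page size: 4 KiB */
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)

/* Returns false when the end computation wraps; otherwise stores the page count. */
static bool sketch_count_pages(unsigned long uaddr, unsigned long len,
			       unsigned long *nr_pages)
{
	unsigned long end = (uaddr + len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	unsigned long start = uaddr >> SKETCH_PAGE_SHIFT;

	assert(nr_pages != NULL);
	if (end < start)
		return false;	/* the patch returns ERR_PTR(-EINVAL) here */

	*nr_pages = end - start;
	return true;
}
```

For a buffer starting in the last page of the address space with a length
of two pages, the sum wraps, end becomes smaller than start, and the
helper rejects the request instead of producing a huge page count.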
fs/bio.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

--- a/fs/bio.c
+++ b/fs/bio.c
@@ -593,6 +593,12 @@ struct bio *bio_copy_user_iov(struct req
end = (uaddr + iov[i].iov_len + PAGE_SIZE - 1) >> PAGE_SHIFT;
start = uaddr >> PAGE_SHIFT;

+ /*
+ * Overflow, abort
+ */
+ if (end < start)
+ return ERR_PTR(-EINVAL);
+
nr_pages += end - start;
len += iov[i].iov_len;
}
@@ -691,6 +697,12 @@ static struct bio *__bio_map_user_iov(st
unsigned long end = (uaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
unsigned long start = uaddr >> PAGE_SHIFT;

+ /*
+ * Overflow, abort
+ */
+ if (end < start)
+ return ERR_PTR(-EINVAL);
+
nr_pages += end - start;
/*
* buffer must be aligned to at least hardsector size for now
@@ -718,7 +730,7 @@ static struct bio *__bio_map_user_iov(st
unsigned long start = uaddr >> PAGE_SHIFT;
const int local_nr_pages = end - start;
const int page_limit = cur_page + local_nr_pages;
-
+
ret = get_user_pages_fast(uaddr, local_nr_pages,
write_to_vm, &pages[cur_page]);
if (ret < local_nr_pages) {

2010-12-08 00:34:26

by Greg KH

Subject: [12/44] drivers/char/vt_ioctl.c: fix VT_OPENQRY error value

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Graham Gower <[email protected]>

commit 1e0ad2881d50becaeea70ec696a80afeadf944d2 upstream.

When all VTs are in use, VT_OPENQRY casts -1 to unsigned char before
returning it to userspace as an int. VT255 is not the next available
console.

Signed-off-by: Graham Gower <[email protected]>
Cc: Greg KH <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
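Reviewer note: the effect of the intermediate type can be shown in
userspace. The helpers are hypothetical, SKETCH_MAX_NR_CONSOLES is an
assumed value, and memcpy() into an int stands in for put_user() through
an int pointer.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_NR_CONSOLES	63	/* assumed value for illustration */

/* Old path: the result passes through an unsigned char first. */
static int sketch_openqry_old(int first_free)
{
	unsigned char ucval = first_free < SKETCH_MAX_NR_CONSOLES ? (first_free + 1) : -1;

	return ucval;	/* -1 has already become 255 here */
}

/* New path: an unsigned int keeps all the bits of -1. */
static int sketch_openqry_new(int first_free)
{
	unsigned int uival = first_free < SKETCH_MAX_NR_CONSOLES ? (first_free + 1) : -1;
	int out;

	memcpy(&out, &uival, sizeof(out));	/* what userspace reads back as an int */
	return out;
}

static void sketch_check_all_in_use(void)
{
	assert(sketch_openqry_old(SKETCH_MAX_NR_CONSOLES) == 255);
	assert(sketch_openqry_new(SKETCH_MAX_NR_CONSOLES) == -1);
}
```

When a console is free, both paths return the same 1-based index; they
differ only in the "none available" case.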
drivers/char/vt_ioctl.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

--- a/drivers/char/vt_ioctl.c
+++ b/drivers/char/vt_ioctl.c
@@ -371,6 +371,7 @@ int vt_ioctl(struct tty_struct *tty, str
struct kbd_struct * kbd;
unsigned int console;
unsigned char ucval;
+ unsigned int uival;
void __user *up = (void __user *)arg;
int i, perm;
int ret = 0;
@@ -516,7 +517,7 @@ int vt_ioctl(struct tty_struct *tty, str
break;

case KDGETMODE:
- ucval = vc->vc_mode;
+ uival = vc->vc_mode;
goto setint;

case KDMAPDISP:
@@ -554,7 +555,7 @@ int vt_ioctl(struct tty_struct *tty, str
break;

case KDGKBMODE:
- ucval = ((kbd->kbdmode == VC_RAW) ? K_RAW :
+ uival = ((kbd->kbdmode == VC_RAW) ? K_RAW :
(kbd->kbdmode == VC_MEDIUMRAW) ? K_MEDIUMRAW :
(kbd->kbdmode == VC_UNICODE) ? K_UNICODE :
K_XLATE);
@@ -576,9 +577,9 @@ int vt_ioctl(struct tty_struct *tty, str
break;

case KDGKBMETA:
- ucval = (vc_kbd_mode(kbd, VC_META) ? K_ESCPREFIX : K_METABIT);
+ uival = (vc_kbd_mode(kbd, VC_META) ? K_ESCPREFIX : K_METABIT);
setint:
- ret = put_user(ucval, (int __user *)arg);
+ ret = put_user(uival, (int __user *)arg);
break;

case KDGETKEYCODE:
@@ -808,7 +809,7 @@ int vt_ioctl(struct tty_struct *tty, str
for (i = 0; i < MAX_NR_CONSOLES; ++i)
if (! VT_IS_IN_USE(i))
break;
- ucval = i < MAX_NR_CONSOLES ? (i+1) : -1;
+ uival = i < MAX_NR_CONSOLES ? (i+1) : -1;
goto setint;

/*

2010-12-08 00:41:18

by Greg KH

Subject: [13/44] eCryptfs: Clear LOOKUP_OPEN flag when creating lower file

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Tyler Hicks <[email protected]>

commit 2e21b3f124eceb6ab5a07c8a061adce14ac94e14 upstream.

eCryptfs was passing the LOOKUP_OPEN flag through to the lower file
system, even though ecryptfs_create() doesn't support the flag. A valid
filp for the lower filesystem could be returned in the nameidata if the
lower file system's create() function supported LOOKUP_OPEN, possibly
resulting in unencrypted writes to the lower file.

However, this is only a potential problem in filesystems (FUSE, NFS,
CIFS, CEPH, 9p) that eCryptfs isn't known to support today.

https://bugs.launchpad.net/ecryptfs/+bug/641703

Reported-by: Kevin Buhr
Signed-off-by: Tyler Hicks <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
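Reviewer note: the save, clear, call, restore sequence in isolation. All
names here are hypothetical, and the flag value is a placeholder rather
than the kernel's LOOKUP_OPEN definition.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_LOOKUP_OPEN	0x0100u	/* placeholder bit, not the real value */

struct sketch_nameidata {
	unsigned int flags;
};

/* Hypothetical lower-filesystem create(): reports the flags it was shown. */
static unsigned int sketch_lower_create(const struct sketch_nameidata *nd)
{
	return nd->flags;
}

/* Hide the open flag from the lower call, then put the caller's flags back. */
static unsigned int sketch_create_underlying(struct sketch_nameidata *nd)
{
	unsigned int flags_save;
	unsigned int seen;

	assert(nd != NULL);
	flags_save = nd->flags;
	nd->flags &= ~SKETCH_LOOKUP_OPEN;
	seen = sketch_lower_create(nd);
	nd->flags = flags_save;
	return seen;
}
```

The lower layer never observes the open flag, while the caller's
nameidata is unchanged once the function returns.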
fs/ecryptfs/inode.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -70,15 +70,19 @@ ecryptfs_create_underlying_file(struct i
struct vfsmount *lower_mnt = ecryptfs_dentry_to_lower_mnt(dentry);
struct dentry *dentry_save;
struct vfsmount *vfsmount_save;
+ unsigned int flags_save;
int rc;

dentry_save = nd->path.dentry;
vfsmount_save = nd->path.mnt;
+ flags_save = nd->flags;
nd->path.dentry = lower_dentry;
nd->path.mnt = lower_mnt;
+ nd->flags &= ~LOOKUP_OPEN;
rc = vfs_create(lower_dir_inode, lower_dentry, mode, nd);
nd->path.dentry = dentry_save;
nd->path.mnt = vfsmount_save;
+ nd->flags = flags_save;
return rc;
}


2010-12-08 00:34:25

by Greg KH

Subject: [07/44] mm: fix return value of scan_lru_pages in memory unplug

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: KAMEZAWA Hiroyuki <[email protected]>

commit f8f72ad5396987e05a42cf7eff826fb2a15ff148 upstream.

scan_lru_pages returns a pfn, so its return type should be "unsigned long",
not "int".

Note: I guess this has worked until now because the memory hotplug testers'
machines did not have very much memory....
physical address < 32bit << PAGE_SHIFT.

Reported-by: KOSAKI Motohiro <[email protected]>
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
Reviewed-by: KOSAKI Motohiro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
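Reviewer note: a rough way to see where an int return value stops being
able to carry a pfn. The helpers are hypothetical and the 4 KiB page
shift used in the check is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a pfn can be returned through an int without changing. */
static bool sketch_pfn_fits_int(unsigned long pfn)
{
	return pfn <= (unsigned long)INT_MAX;
}

/* Lowest physical address whose pfn no longer fits in an int. */
static uint64_t sketch_first_bad_phys_addr(unsigned int page_shift)
{
	return ((uint64_t)INT_MAX + 1) << page_shift;
}

static void sketch_check_limit(void)
{
	assert(sketch_pfn_fits_int(0x1000UL));
	assert(!sketch_pfn_fits_int((unsigned long)INT_MAX + 1UL));
	/* With 4 KiB pages the limit sits at 2^43 bytes. */
	assert(sketch_first_bad_phys_addr(12) == (UINT64_C(1) << 43));
}
```

Machines with less physical memory than that never produce a pfn that
the old return type would corrupt, which fits the note in the changelog.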
mm/memory_hotplug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -612,7 +612,7 @@ static int test_pages_in_a_zone(unsigned
* Scanning pfn is much easier than scanning lru list.
* Scan pfn from start to end and Find LRU page.
*/
-int scan_lru_pages(unsigned long start, unsigned long end)
+unsigned long scan_lru_pages(unsigned long start, unsigned long end)
{
unsigned long pfn;
struct page *page;

2010-12-08 00:41:42

by Greg KH

Subject: [11/44] sys_semctl: fix kernel stack leakage

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Dan Rosenberg <[email protected]>

commit 982f7c2b2e6a28f8f266e075d92e19c0dd4c6e56 upstream.

The semctl syscall has several code paths that lead to the leakage of
uninitialized kernel stack memory (namely the IPC_INFO, SEM_INFO,
IPC_STAT, and SEM_STAT commands) during the use of the older, obsolete
version of the semid_ds struct.

The copy_semid_to_user() function declares a semid_ds struct on the stack
and copies it back to the user without initializing or zeroing the
"sem_base", "sem_pending", "sem_pending_last", and "undo" pointers,
allowing the leakage of 16 bytes of kernel stack memory.

The code is still reachable on 32-bit systems - when calling semctl()
newer glibc's automatically OR the IPC command with the IPC_64 flag, but
invoking the syscall directly allows users to use the older versions of
the struct.

Signed-off-by: Dan Rosenberg <[email protected]>
Cc: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
ipc/sem.c | 2 ++
1 file changed, 2 insertions(+)

--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -560,6 +560,8 @@ static unsigned long copy_semid_to_user(
{
struct semid_ds out;

+ memset(&out, 0, sizeof(out));
+
ipc64_perm_to_ipc_perm(&in->sem_perm, &out.sem_perm);

out.sem_otime = in->sem_otime;

2010-12-08 00:34:24

by Greg KH

Subject: [06/44] numa: fix slab_node(MPOL_BIND)

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Eric Dumazet <[email protected]>

commit 800416f799e0723635ac2d720ad4449917a1481c upstream.

When a node contains only HighMem memory, slab_node(MPOL_BIND)
dereferences a NULL pointer.

[ This code seems to go back all the way to commit 19770b32609b: "mm:
filter based on a nodemask as well as a gfp_mask". Which was back in
April 2008, and it got merged into 2.6.26. - Linus ]

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Lee Schermerhorn <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/mempolicy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1404,7 +1404,7 @@ unsigned slab_node(struct mempolicy *pol
(void)first_zones_zonelist(zonelist, highest_zoneidx,
&policy->v.nodes,
&zone);
- return zone->node;
+ return zone ? zone->node : numa_node_id();
}

default:

2010-12-08 00:42:00

by Greg KH

Subject: [10/44] ipc: shm: fix information leak to userland

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Vasiliy Kulikov <[email protected]>

commit 3af54c9bd9e6f14f896aac1bb0e8405ae0bc7a44 upstream.

The shmid_ds structure is copied to userland with the shm_unused{,2,3}
fields uninitialized. This leaks contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <[email protected]>
Acked-by: Al Viro <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
ipc/shm.c | 1 +
1 file changed, 1 insertion(+)

--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -469,6 +469,7 @@ static inline unsigned long copy_shmid_t
{
struct shmid_ds out;

+ memset(&out, 0, sizeof(out));
ipc64_perm_to_ipc_perm(&in->shm_perm, &out.shm_perm);
out.shm_segsz = in->shm_segsz;
out.shm_atime = in->shm_atime;

2010-12-08 00:34:23

by Greg KH

Subject: [05/44] um: fix global timer issue when using CONFIG_NO_HZ

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Richard Weinberger <[email protected]>

commit 482db6df1746c4fa7d64a2441d4cb2610249c679 upstream.

This fixes an issue which was introduced by fe2cc53e ("uml: track and make
up lost ticks").

timeval_to_ns() returns long long and not int. Because of that, UML's timer
did not work properly and caused timer freezes.

Signed-off-by: Richard Weinberger <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Cc: Jeff Dike <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/um/os-Linux/time.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/um/os-Linux/time.c
+++ b/arch/um/os-Linux/time.c
@@ -60,7 +60,7 @@ static inline long long timeval_to_ns(co
long long disable_timer(void)
{
struct itimerval time = ((struct itimerval) { { 0, 0 }, { 0, 0 } });
- int remain, max = UM_NSEC_PER_SEC / UM_HZ;
+ long long remain, max = UM_NSEC_PER_SEC / UM_HZ;

if (setitimer(ITIMER_VIRTUAL, &time, &time) < 0)
printk(UM_KERN_ERR "disable_timer - setitimer failed, "
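
A rough userspace sketch of the type problem, assuming a 32-bit int: a nanosecond count of roughly 2.15 seconds or more no longer fits, so storing the long long result in an int loses information. The names demo_timeval_to_ns, demo_overflows_int and DEMO_NSEC_PER_SEC are made up for illustration and are not the UML helpers.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for UM_NSEC_PER_SEC. */
#define DEMO_NSEC_PER_SEC 1000000000LL

/* Convert seconds + microseconds to nanoseconds, in the spirit of
 * timeval_to_ns(); the arithmetic is done in long long throughout. */
static long long demo_timeval_to_ns(long sec, long usec)
{
	assert(usec >= 0 && usec < 1000000);
	return (long long)sec * DEMO_NSEC_PER_SEC + (long long)usec * 1000LL;
}

/* Return 1 if 'ns' cannot be stored in an int without losing information. */
static int demo_overflows_int(long long ns)
{
	return ns > INT_MAX || ns < INT_MIN;
}
```

For example, demo_timeval_to_ns(3, 0) is 3000000000, which demo_overflows_int() flags on a platform with 32-bit int, while half a second (500000000 ns) still fits.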

2010-12-08 00:42:18

by Greg KH

Subject: [09/44] ipc: initialize structure memory to zero for compat functions

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Dan Rosenberg <[email protected]>

commit 03145beb455cf5c20a761e8451e30b8a74ba58d9 upstream.

This takes care of leaking uninitialized kernel stack memory to
userspace from non-zeroed fields in structs in compat ipc functions.

Signed-off-by: Dan Rosenberg <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
ipc/compat.c | 6 ++++++
ipc/compat_mq.c | 5 +++++
2 files changed, 11 insertions(+)

--- a/ipc/compat.c
+++ b/ipc/compat.c
@@ -242,6 +242,8 @@ long compat_sys_semctl(int first, int se
struct semid64_ds __user *up64;
int version = compat_ipc_parse_version(&third);

+ memset(&s64, 0, sizeof(s64));
+
if (!uptr)
return -EINVAL;
if (get_user(pad, (u32 __user *) uptr))
@@ -422,6 +424,8 @@ long compat_sys_msgctl(int first, int se
int version = compat_ipc_parse_version(&second);
void __user *p;

+ memset(&m64, 0, sizeof(m64));
+
switch (second & (~IPC_64)) {
case IPC_INFO:
case IPC_RMID:
@@ -595,6 +599,8 @@ long compat_sys_shmctl(int first, int se
int err, err2;
int version = compat_ipc_parse_version(&second);

+ memset(&s64, 0, sizeof(s64));
+
switch (second & (~IPC_64)) {
case IPC_RMID:
case SHM_LOCK:
--- a/ipc/compat_mq.c
+++ b/ipc/compat_mq.c
@@ -53,6 +53,9 @@ asmlinkage long compat_sys_mq_open(const
void __user *p = NULL;
if (u_attr && oflag & O_CREAT) {
struct mq_attr attr;
+
+ memset(&attr, 0, sizeof(attr));
+
p = compat_alloc_user_space(sizeof(attr));
if (get_compat_mq_attr(&attr, u_attr) ||
copy_to_user(p, &attr, sizeof(attr)))
@@ -127,6 +130,8 @@ asmlinkage long compat_sys_mq_getsetattr
struct mq_attr __user *p = compat_alloc_user_space(2 * sizeof(*p));
long ret;

+ memset(&mqstat, 0, sizeof(mqstat));
+
if (u_mqstat) {
if (get_compat_mq_attr(&mqstat, u_mqstat) ||
copy_to_user(p, &mqstat, sizeof(mqstat)))

2010-12-08 00:42:37

by Greg KH

Subject: [08/44] mm: fix is_mem_section_removable() page_order BUG_ON check

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: KAMEZAWA Hiroyuki <[email protected]>

commit 572438f9b52236bd8938b1647cc15e027d27ef55 upstream.

page_order() is called by memory hotplug's user interface to check whether
a section is removable (is_mem_section_removable()).

It calls page_order() without holding zone->lock.
So, even if the caller does

	if (PageBuddy(page))
		ret = page_order(page) ...

the caller may hit BUG_ON(), because the page can stop being a buddy page
between the PageBuddy() check and the page_order() call.

There are 2 ways to fix this:
1. add zone->lock.
2. remove BUG_ON().

is_mem_section_removable() is used for some "advice" and doesn't need to
be 100% accurate. It can be called from a user program, and we don't want
to hold this important lock for long at a user's request. So,
this patch removes BUG_ON().

Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
Acked-by: Wu Fengguang <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/internal.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/internal.h
+++ b/mm/internal.h
@@ -49,7 +49,7 @@ extern void __free_pages_bootmem(struct
*/
static inline unsigned long page_order(struct page *page)
{
- VM_BUG_ON(!PageBuddy(page));
+ /* PageBuddy() must be checked by the caller */
return page_private(page);
}


2010-12-08 00:42:38

by Greg KH

Subject: [04/44] percpu: fix list_head init bug in __percpu_counter_init()

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Masanori ITOH <[email protected]>

commit 8474b591faf3bb0a1e08a60d21d6baac498f15e4 upstream.

WARNING: at lib/list_debug.c:26 __list_add+0x3f/0x81()
Hardware name: Express5800/B120a [N8400-085]
list_add corruption. next->prev should be prev (ffffffff81a7ea00), but was dead000000200200. (next=ffff88080b872d58).
Modules linked in: aoe ipt_MASQUERADE iptable_nat nf_nat autofs4 sunrpc bridge 8021q garp stp llc ipv6 cpufreq_ondemand acpi_cpufreq freq_table dm_round_robin dm_multipath kvm_intel kvm uinput lpfc scsi_transport_fc igb ioatdma scsi_tgt i2c_i801 i2c_core dca iTCO_wdt iTCO_vendor_support pcspkr shpchp megaraid_sas [last unloaded: aoe]
Pid: 54, comm: events/3 Tainted: G W 2.6.34-vanilla1 #1
Call Trace:
[<ffffffff8104bd77>] warn_slowpath_common+0x7c/0x94
[<ffffffff8104bde6>] warn_slowpath_fmt+0x41/0x43
[<ffffffff8120fd2e>] __list_add+0x3f/0x81
[<ffffffff81212a12>] __percpu_counter_init+0x59/0x6b
[<ffffffff810d8499>] bdi_init+0x118/0x17e
[<ffffffff811f2c50>] blk_alloc_queue_node+0x79/0x143
[<ffffffff811f2d2b>] blk_alloc_queue+0x11/0x13
[<ffffffffa02a931d>] aoeblk_gdalloc+0x8e/0x1c9 [aoe]
[<ffffffffa02aa655>] aoecmd_sleepwork+0x25/0xa8 [aoe]
[<ffffffff8106186c>] worker_thread+0x1a9/0x237
[<ffffffffa02aa630>] ? aoecmd_sleepwork+0x0/0xa8 [aoe]
[<ffffffff81065827>] ? autoremove_wake_function+0x0/0x39
[<ffffffff810616c3>] ? worker_thread+0x0/0x237
[<ffffffff810653ad>] kthread+0x7f/0x87
[<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
[<ffffffff8106532e>] ? kthread+0x0/0x87
[<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10

This happens because there is no initialization code for a list_head contained in
the struct backing_dev_info under CONFIG_HOTPLUG_CPU, and the bug shows up
when block device drivers that call blk_alloc_queue() are used. In my
case, I hit it by using aoe.

Signed-off-by: Masanori Itoh <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
lib/percpu_counter.c | 1 +
1 file changed, 1 insertion(+)

--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -81,6 +81,7 @@ int percpu_counter_init(struct percpu_co
if (!fbc->counters)
return -ENOMEM;
#ifdef CONFIG_HOTPLUG_CPU
+ INIT_LIST_HEAD(&fbc->list);
mutex_lock(&percpu_counters_lock);
list_add(&fbc->list, &percpu_counters);
mutex_unlock(&percpu_counters_lock);

2010-12-08 00:43:12

by Greg KH

Subject: [02/44] irda: Fix parameter extraction stack overflow

2.6.27-stable review patch. If anyone has any objections, please let us know.

------------------

From: Samuel Ortiz <[email protected]>

commit efc463eb508798da4243625b08c7396462cabf9f upstream.

Reported-by: Ilja Van Sprundel <[email protected]>
Signed-off-by: Samuel Ortiz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/irda/parameters.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/irda/parameters.c
+++ b/net/irda/parameters.c
@@ -298,6 +298,8 @@ static int irda_extract_string(void *sel

p.pi = pi; /* In case handler needs to know */
p.pl = buf[1]; /* Extract length of value */
+ if (p.pl > 32)
+ p.pl = 32;

IRDA_DEBUG(2, "%s(), pi=%#x, pl=%d\n", __func__,
p.pi, p.pl);
@@ -318,7 +320,7 @@ static int irda_extract_string(void *sel
(__u8) str[0], (__u8) str[1]);

/* Null terminate string */
- str[p.pl+1] = '\0';
+ str[p.pl] = '\0';

p.pv.c = str; /* Handler will need to take a copy */
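
The two changes in this patch (capping the claimed length, and writing the terminator at index p.pl rather than p.pl+1) can be sketched in userspace as follows. This is only an illustration under stated assumptions: demo_extract_string and DEMO_MAX_LEN are hypothetical names, the destination is assumed to hold DEMO_MAX_LEN + 1 bytes, and the extra clamp against the input size is this sketch's own addition, not something taken from net/irda/parameters.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_LEN 32		/* mirrors the 32-byte cap in the patch */

/*
 * Copy a length-prefixed value into 'str', which must hold at least
 * DEMO_MAX_LEN + 1 bytes. buf[0] is an identifier, buf[1] the claimed
 * length, and the value starts at buf[2].
 * Returns the number of bytes copied, or -1 if 'buf' is too short.
 */
static int demo_extract_string(const unsigned char *buf, size_t buf_len,
			       char *str)
{
	size_t pl;

	assert(buf != NULL && str != NULL);
	if (buf_len < 2)
		return -1;

	pl = buf[1];
	if (pl > DEMO_MAX_LEN)
		pl = DEMO_MAX_LEN;	/* never trust the on-wire length */
	if (pl > buf_len - 2)
		pl = buf_len - 2;	/* sketch-only: do not read past the input */

	memcpy(str, buf + 2, pl);
	str[pl] = '\0';			/* terminator directly after the data */
	return (int)pl;
}
```

With a claimed length of 40, the sketch copies only 32 bytes and terminates at str[32], the last valid index of a 33-byte buffer.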


2010-12-08 01:33:00

by Linus Torvalds

Subject: Re: [34/44] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.

On Tue, Dec 7, 2010 at 4:04 PM, Greg KH <[email protected]> wrote:
>
> From: Robin Holt <[email protected]>
>
> [ Problem was fixed differently upstream. -DaveM ]

Gaah. I'd really like to see more of a description for things like
this. A commit ID for the alternate fix, or at least a few words about
the different fix or reason why upstream doesn't need the stable
commit.

Linus

2010-12-08 03:03:25

by Lee Schermerhorn

Subject: Re: [06/44] numa: fix slab_node(MPOL_BIND)

On Tue, 2010-12-07 at 16:04 -0800, Greg KH wrote:
> 2.6.27-stable review patch. If anyone has any objections, please let us know.
>
> ------------------
>
> From: Eric Dumazet <[email protected]>
>
> commit 800416f799e0723635ac2d720ad4449917a1481c upstream.
>
> When a node contains only HighMem memory, slab_node(MPOL_BIND)
> dereferences a NULL pointer.
>
> [ This code seems to go back all the way to commit 19770b32609b: "mm:
> filter based on a nodemask as well as a gfp_mask". Which was back in
> April 2008, and it got merged into 2.6.26. - Linus ]
>
> Signed-off-by: Eric Dumazet <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Lee Schermerhorn <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> ---
> mm/mempolicy.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1404,7 +1404,7 @@ unsigned slab_node(struct mempolicy *pol
> (void)first_zones_zonelist(zonelist, highest_zoneidx,
> &policy->v.nodes,
> &zone);
> - return zone->node;
> + return zone ? zone->node : numa_node_id();

I think this should be numa_mem_id(). Given the documented purpose of
slab_node(), we want a node from which page allocation is likely to
succeed. numa_node_id() can return a memoryless node for, e.g., some
configurations of some HP ia64 platforms. numa_mem_id() was introduced
to return that same node from which "local" mempolicy would allocate
pages.

Lee

> }
>
> default:
>
>



2010-12-08 04:19:49

by Greg KH

Subject: Re: [06/44] numa: fix slab_node(MPOL_BIND)

On Tue, Dec 07, 2010 at 10:03:42PM -0500, Lee Schermerhorn wrote:
> On Tue, 2010-12-07 at 16:04 -0800, Greg KH wrote:
> > 2.6.27-stable review patch. If anyone has any objections, please let us know.
> >
> > ------------------
> >
> > From: Eric Dumazet <[email protected]>
> >
> > commit 800416f799e0723635ac2d720ad4449917a1481c upstream.
> >
> > When a node contains only HighMem memory, slab_node(MPOL_BIND)
> > dereferences a NULL pointer.
> >
> > [ This code seems to go back all the way to commit 19770b32609b: "mm:
> > filter based on a nodemask as well as a gfp_mask". Which was back in
> > April 2008, and it got merged into 2.6.26. - Linus ]
> >
> > Signed-off-by: Eric Dumazet <[email protected]>
> > Cc: Mel Gorman <[email protected]>
> > Cc: Christoph Lameter <[email protected]>
> > Cc: Lee Schermerhorn <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Signed-off-by: Linus Torvalds <[email protected]>
> > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> >
> > ---
> > mm/mempolicy.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1404,7 +1404,7 @@ unsigned slab_node(struct mempolicy *pol
> > (void)first_zones_zonelist(zonelist, highest_zoneidx,
> > &policy->v.nodes,
> > &zone);
> > - return zone->node;
> > + return zone ? zone->node : numa_node_id();
>
> I think this should be numa_mem_id(). Given the documented purpose of
> slab_node(), we want a node from which page allocation is likely to
> succeed. numa_node_id() can return a memoryless node for, e.g., some
> configurations of some HP ia64 platforms. numa_mem_id() was introduced
> to return that same node from which "local" mempolicy would allocate
> pages.

So should the upstream patch be changed?

thanks,

greg k-h

2010-12-08 04:33:10

by Eric Dumazet

Subject: Re: [06/44] numa: fix slab_node(MPOL_BIND)

On Tuesday, 07 December 2010 at 22:03 -0500, Lee Schermerhorn wrote:
> On Tue, 2010-12-07 at 16:04 -0800, Greg KH wrote:
> > 2.6.27-stable review patch. If anyone has any objections, please let us know.
> >
> > ------------------
> >
> > From: Eric Dumazet <[email protected]>
> >
> > commit 800416f799e0723635ac2d720ad4449917a1481c upstream.
> >

> >
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1404,7 +1404,7 @@ unsigned slab_node(struct mempolicy *pol
> > (void)first_zones_zonelist(zonelist, highest_zoneidx,
> > &policy->v.nodes,
> > &zone);
> > - return zone->node;
> > + return zone ? zone->node : numa_node_id();
>
> I think this should be numa_mem_id(). Given the documented purpose of
> slab_node(), we want a node from which page allocation is likely to
> succeed. numa_node_id() can return a memoryless node for, e.g., some
> configurations of some HP ia64 platforms. numa_mem_id() was introduced
> to return that same node from which "local" mempolicy would allocate
> pages.

Hmm... numa_mem_id() was introduced in 2.6.35 as an optimization.

When I did this patch (to fix a bug), mm/mempolicy.c only contained
calls to numa_node_id() (and still does today).

By the way, does anybody know how I can emulate a memoryless node on a dual
node x86_64 machine (with memory present on both nodes)?


2010-12-08 04:37:13

by Eric Dumazet

Subject: Re: [06/44] numa: fix slab_node(MPOL_BIND)

On Tuesday, 07 December 2010 at 20:17 -0800, Greg KH wrote:
> On Tue, Dec 07, 2010 at 10:03:42PM -0500, Lee Schermerhorn wrote:
> >
> > I think this should be numa_mem_id(). Given the documented purpose of
> > slab_node(), we want a node from which page allocation is likely to
> > succeed. numa_node_id() can return a memoryless node for, e.g., some
> > configurations of some HP ia64 platforms. numa_mem_id() was introduced
> > to return that same node from which "local" mempolicy would allocate
> > pages.
>
> So should the upstream patch be changed?

We certainly can convert most numa_node_id() calls to numa_mem_id()
ones, but that won't be backported to 2.6.32, 2.6.34 & 2.6.27 :)

2010-12-08 05:07:12

by Eric Dumazet

Subject: Re: [06/44] numa: fix slab_node(MPOL_BIND)

On Wednesday, 08 December 2010 at 05:33 +0100, Eric Dumazet wrote:

> By the way, does anybody know how I can emulate a memoryless node on a dual
> node x86_64 machine (with memory present on both nodes)?
>
>

This hack works for me:

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index a35cb9d..1087333 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -261,6 +261,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
end = start + ma->length;
pxm = ma->proximity_domain;
node = setup_node(pxm);
+ node = 0;
if (node < 0) {
printk(KERN_ERR "SRAT: Too many proximity domains.\n");
bad_srat();

[ 0.000000] SRAT: PXM 0 -> APIC 0x00 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 0x01 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 0x02 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 0x03 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 0x04 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 0x05 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 0x06 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 0x07 -> Node 0
[ 0.000000] SRAT: PXM 1 -> APIC 0x10 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 0x11 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 0x12 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 0x13 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 0x14 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 0x15 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 0x16 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 0x17 -> Node 1
[ 0.000000] SRAT: Node 0 PXM 0 0-80000000
[ 0.000000] SRAT: Node 0 PXM 1 80000000-e0000000
[ 0.000000] SRAT: Node 0 PXM 1 100000000-120000000
[ 0.000000] SRAT: Node 0 [0,80000000) + [80000000,e0000000) -> [0,e0000000)
[ 0.000000] SRAT: Node 0 [0,e0000000) + [100000000,120000000) -> [0,120000000)
[ 0.000000] NUMA: Using 63 for the hash shift.

2010-12-08 05:26:45

by Greg KH

Subject: Re: [34/44] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.

On Tue, Dec 07, 2010 at 05:22:34PM -0800, Linus Torvalds wrote:
> On Tue, Dec 7, 2010 at 4:04 PM, Greg KH <[email protected]> wrote:
> >
> > From: Robin Holt <[email protected]>
> >
> > [ Problem was fixed differently upstream. -DaveM ]
>
> Gaah. I'd really like to see more of a description for things like
> this. A commit ID for the alternate fix, or at least a few words about
> the different fix or reason why upstream doesn't need the stable
> commit.

I'll let David confirm this, he's the one who sent it to me :)

thanks,

greg k-h

2010-12-08 05:51:00

by Eric Dumazet

Subject: Re: [34/44] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.

On Tuesday, 07 December 2010 at 20:16 -0800, Greg KH wrote:
> On Tue, Dec 07, 2010 at 05:22:34PM -0800, Linus Torvalds wrote:
> > On Tue, Dec 7, 2010 at 4:04 PM, Greg KH <[email protected]> wrote:
> > >
> > > From: Robin Holt <[email protected]>
> > >
> > > [ Problem was fixed differently upstream. -DaveM ]
> >
> > Gaah. I'd really like to see more of a description for things like
> > this. A commit ID for the alternate fix, or at least a few words about
> > the different fix or reason why upstream doesn't need the stable
> > commit.
>
> I'll let David confirm this, he's the one who sent it to me :)

Upstream uses commit 8d987e5c7510 (net: avoid limits overflow).

This commit is a bit more intrusive for stable kernels:

It depends on:
a9febbb4bd13 (sysctl: min/max bounds are optional)
27b3d80a7b6a (sysctl: fix min/max handling in __do_proc_doulongvec_minmax())



2010-12-08 13:52:44

by Lee Schermerhorn

Subject: Re: [06/44] numa: fix slab_node(MPOL_BIND)

On Wed, 2010-12-08 at 05:33 +0100, Eric Dumazet wrote:
> On Tuesday, 07 December 2010 at 22:03 -0500, Lee Schermerhorn wrote:
> > On Tue, 2010-12-07 at 16:04 -0800, Greg KH wrote:
> > > 2.6.27-stable review patch. If anyone has any objections, please let us know.
> > >
> > > ------------------
> > >
> > > From: Eric Dumazet <[email protected]>
> > >
> > > commit 800416f799e0723635ac2d720ad4449917a1481c upstream.
> > >
>
> > >
> > > --- a/mm/mempolicy.c
> > > +++ b/mm/mempolicy.c
> > > @@ -1404,7 +1404,7 @@ unsigned slab_node(struct mempolicy *pol
> > > (void)first_zones_zonelist(zonelist, highest_zoneidx,
> > > &policy->v.nodes,
> > > &zone);
> > > - return zone->node;
> > > + return zone ? zone->node : numa_node_id();
> >
> > I think this should be numa_mem_id(). Given the documented purpose of
> > slab_node(), we want a node from which page allocation is likely to
> > succeed. numa_node_id() can return a memoryless node for, e.g., some
> > configurations of some HP ia64 platforms. numa_mem_id() was introduced
> > to return that same node from which "local" mempolicy would allocate
> > pages.
>
> Hmm... numa_mem_id() was introduced in 2.6.35 as an optimization.
>
> When I did this patch (to fix a bug), mm/mempolicy.c only contained
> calls to numa_node_id() (and still does today).

Sometimes you want numa_node_id()--e.g., for use with a mempolicy-based
allocation that allows fallback. When the node id will be used for a
'_THIS_NODE allocation, numa_mem_id() is preferred as it will always
return a node that contains or contained--maybe now oom--memory. It's
the same as numa_node_id() on platforms that don't expose memoryless
nodes.
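
The distinction can be modeled with a small userspace sketch. Everything here is hypothetical: demo_node_id() and demo_mem_id() merely stand in for numa_node_id() and numa_mem_id(), the nearest-memory table is invented, and demo_slab_node() only mirrors the NULL-check pattern from the patch with the numa_mem_id()-style fallback suggested above.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_NODES 4

struct demo_zone {
	int node;
};

/* Invented topology: for each node, the nearest node that has memory.
 * Nodes 1 and 3 are treated as memoryless here. */
static const int demo_nearest_mem_node[DEMO_NR_NODES] = { 0, 0, 2, 2 };

/* Stand-in for numa_node_id(): the node the caller is running on,
 * whether or not that node has memory. */
static int demo_node_id(int cpu_node)
{
	assert(cpu_node >= 0 && cpu_node < DEMO_NR_NODES);
	return cpu_node;
}

/* Stand-in for numa_mem_id(): a node with memory near the caller. */
static int demo_mem_id(int cpu_node)
{
	assert(cpu_node >= 0 && cpu_node < DEMO_NR_NODES);
	return demo_nearest_mem_node[cpu_node];
}

/* NULL-check pattern from the patch: use the zone's node when the
 * lookup found one, otherwise fall back to a node with memory. */
static int demo_slab_node(const struct demo_zone *zone, int cpu_node)
{
	return zone ? zone->node : demo_mem_id(cpu_node);
}
```

On a machine where every node has memory, the table would map each node to itself and the two stand-ins would return the same value.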

>
> By the way, does anybody know how I can emulate a memoryless node on a dual
> node x86_64 machine (with memory present on both nodes)?
>

You can use the mem= boot parameter and specify the amount of memory on
the 1st/boot node. Or you can use the memmap parameter to reserve the
memory on the 2nd/non-boot node. With the memmap parameter, you can
reserve the memory of nodes other than the highest numbered
one[s]--e.g., on a >2 node platform. However, you'll probably need a patch
to see the cpus on any node that you hide using memmap. I have such a
patch if you're interested in going that route.

You can also reduce the amount of memory on any/each node by reserving
ranges of physical memory with memmap. Use the 'SRAT.*PXM' boot
messages to find the nodes' physical memory ranges and reserve however
much you want off the top of the nodes.

Lee


2010-12-08 13:54:01

by Lee Schermerhorn

Subject: Re: [06/44] numa: fix slab_node(MPOL_BIND)

On Tue, 2010-12-07 at 20:17 -0800, Greg KH wrote:
> On Tue, Dec 07, 2010 at 10:03:42PM -0500, Lee Schermerhorn wrote:
> > On Tue, 2010-12-07 at 16:04 -0800, Greg KH wrote:
> > > 2.6.27-stable review patch. If anyone has any objections, please let us know.
> > >
> > > ------------------
> > >
> > > From: Eric Dumazet <[email protected]>
> > >
> > > commit 800416f799e0723635ac2d720ad4449917a1481c upstream.
> > >
> > > When a node contains only HighMem memory, slab_node(MPOL_BIND)
> > > dereferences a NULL pointer.
> > >
> > > [ This code seems to go back all the way to commit 19770b32609b: "mm:
> > > filter based on a nodemask as well as a gfp_mask". Which was back in
> > > April 2008, and it got merged into 2.6.26. - Linus ]
> > >
> > > Signed-off-by: Eric Dumazet <[email protected]>
> > > Cc: Mel Gorman <[email protected]>
> > > Cc: Christoph Lameter <[email protected]>
> > > Cc: Lee Schermerhorn <[email protected]>
> > > Cc: Andrew Morton <[email protected]>
> > > Signed-off-by: Linus Torvalds <[email protected]>
> > > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> > >
> > > ---
> > > mm/mempolicy.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > --- a/mm/mempolicy.c
> > > +++ b/mm/mempolicy.c
> > > @@ -1404,7 +1404,7 @@ unsigned slab_node(struct mempolicy *pol
> > > (void)first_zones_zonelist(zonelist, highest_zoneidx,
> > > &policy->v.nodes,
> > > &zone);
> > > - return zone->node;
> > > + return zone ? zone->node : numa_node_id();
> >
> > I think this should be numa_mem_id(). Given the documented purpose of
> > slab_node(), we want a node from which page allocation is likely to
> > succeed. numa_node_id() can return a memoryless node for, e.g., some
> > configurations of some HP ia64 platforms. numa_mem_id() was introduced
> > to return that same node from which "local" mempolicy would allocate
> > pages.
>
> So should the upstream patch be changed?
>

Yeah, probably should. I didn't see it go by.

Lee

2010-12-08 16:24:57

by David Miller

Subject: Re: [34/44] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.

From: Eric Dumazet <[email protected]>
Date: Wed, 08 Dec 2010 06:50:45 +0100

> On Tuesday, 07 December 2010 at 20:16 -0800, Greg KH wrote:
>> On Tue, Dec 07, 2010 at 05:22:34PM -0800, Linus Torvalds wrote:
>> > On Tue, Dec 7, 2010 at 4:04 PM, Greg KH <[email protected]> wrote:
>> > >
>> > > From: Robin Holt <[email protected]>
>> > >
>> > > [ Problem was fixed differently upstream. -DaveM ]
>> >
>> > Gaah. I'd really like to see more of a description for things like
>> > this. A commit ID for the alternate fix, or at least a few words about
>> > the different fix or reason why upstream doesn't need the stable
>> > commit.
>>
>> I'll let David confirm this, he's the one who sent it to me :)
>
> upstream uses commit 8d987e5c7510 (net: avoid limits overflow)
>
> This commit is a bit more intrusive for stable kernels:
>
> It depends on :
> a9febbb4bd13 (sysctl: min/max bounds are optional)
> 27b3d80a7b6a (sysctl: fix min/max handling in __do_proc_doulongvec_minmax())

Yep, this is the case. Greg, you can add a reference to:

a9febbb4bd13
27b3d80a7b6a
8d987e5c7510

in my "[ ... ]" in the commit message to clear this up.

2010-12-08 23:14:07

by Greg KH

Subject: Re: [34/44] Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.

On Wed, Dec 08, 2010 at 08:25:22AM -0800, David Miller wrote:
> From: Eric Dumazet <[email protected]>
> Date: Wed, 08 Dec 2010 06:50:45 +0100
>
> > On Tuesday, 07 December 2010 at 20:16 -0800, Greg KH wrote:
> >> On Tue, Dec 07, 2010 at 05:22:34PM -0800, Linus Torvalds wrote:
> >> > On Tue, Dec 7, 2010 at 4:04 PM, Greg KH <[email protected]> wrote:
> >> > >
> >> > > From: Robin Holt <[email protected]>
> >> > >
> >> > > [ Problem was fixed differently upstream. -DaveM ]
> >> >
> >> > Gaah. I'd really like to see more of a description for things like
> >> > this. A commit ID for the alternate fix, or at least a few words about
> >> > the different fix or reason why upstream doesn't need the stable
> >> > commit.
> >>
> >> I'll let David confirm this, he's the one who sent it to me :)
> >
> > upstream uses commit 8d987e5c7510 (net: avoid limits overflow)
> >
> > This commit is a bit more intrusive for stable kernels:
> >
> > It depends on :
> > a9febbb4bd13 (sysctl: min/max bounds are optional)
> > 27b3d80a7b6a (sysctl: fix min/max handling in __do_proc_doulongvec_minmax())
>
> Yep, this is the case. Greg, you can add a reference to:
>
> a9febbb4bd13
> 27b3d80a7b6a
> 8d987e5c7510
>
> in my "[ ... ]" in the commit message to clear this up.

Now added, thanks.

greg k-h