2013-04-29 19:11:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 00/30] 3.0.76-stable review

This is the start of the stable review cycle for the 3.0.76 release.
There are 30 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed May 1 18:42:50 UTC 2013.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.76-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 3.0.76-rc1

Sam Ravnborg <[email protected]>
sparc32: support atomic64_t

Eric Dumazet <[email protected]>
net: drop dst before queueing fragments

Wei Yongjun <[email protected]>
netrom: fix invalid use of sizeof in nr_recvmsg()

Mathias Krause <[email protected]>
tipc: fix info leaks via msg_name in recv_msg/recv_stream

Mathias Krause <[email protected]>
rose: fix info leak via msg_name in rose_recvmsg()

Mathias Krause <[email protected]>
netrom: fix info leak via msg_name in nr_recvmsg()

Mathias Krause <[email protected]>
llc: Fix missing msg_namelen update in llc_ui_recvmsg()

Mathias Krause <[email protected]>
iucv: Fix missing msg_namelen update in iucv_sock_recvmsg()

Mathias Krause <[email protected]>
irda: Fix missing msg_namelen update in irda_recvmsg_dgram()

Mathias Krause <[email protected]>
caif: Fix missing msg_namelen update in caif_seqpkt_recvmsg()

Mathias Krause <[email protected]>
Bluetooth: RFCOMM - Fix missing msg_namelen update in rfcomm_sock_recvmsg()

Mathias Krause <[email protected]>
Bluetooth: fix possible info leak in bt_sock_recvmsg()

Mathias Krause <[email protected]>
ax25: fix info leak via msg_name in ax25_recvmsg()

Mathias Krause <[email protected]>
atm: update msg_namelen in vcc_recvmsg()

Linus Torvalds <[email protected]>
net: fix incorrect credentials passing

Eric Dumazet <[email protected]>
tcp: call tcp_replace_ts_recent() from tcp_ack()

Daniel Borkmann <[email protected]>
net: sctp: sctp_auth_key_put: use kzfree instead of kfree

Wei Yongjun <[email protected]>
esp4: fix error return code in esp_output()

Dmitry Popov <[email protected]>
tcp: incoming connections might use wrong route under synflood

Michael Riesch <[email protected]>
rtnetlink: Call nlmsg_parse() with correct header length

Patrick McHardy <[email protected]>
netfilter: don't reset nf_trace in nf_reset()

Eric W. Biederman <[email protected]>
af_unix: If we don't care about credentials coallesce all messages

[email protected] <[email protected]>
bonding: IFF_BONDING is not stripped on enslave failure

Hannes Frederic Sowa <[email protected]>
atl1e: limit gso segment size to prevent generation of wrong ip length fields

Vlad Yasevich <[email protected]>
net: count hw_addr syncs so that unsync works properly.

Balakumaran Kannan <[email protected]>
net IPv6 : Fix broken IPv6 routing table after loopback down-up

Vasily Averin <[email protected]>
cbq: incorrect processing of high limits

David S. Miller <[email protected]>
sparc64: Fix race in TLB batch processing.

Jiri Slaby <[email protected]>
TTY: fix atime/mtime regression

Jiri Slaby <[email protected]>
TTY: do not update atime/mtime on read/write


-------------

Diffstat:

Makefile | 4 +-
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/atomic_32.h | 2 +
arch/sparc/include/asm/pgtable_64.h | 1 +
arch/sparc/include/asm/system_64.h | 3 +-
arch/sparc/include/asm/tlbflush_64.h | 37 +++++++++--
arch/sparc/kernel/smp_64.c | 41 ++++++++++--
arch/sparc/mm/tlb.c | 39 ++++++++++--
arch/sparc/mm/tsb.c | 57 ++++++++++++-----
arch/sparc/mm/ultra.S | 119 ++++++++++++++++++++++++++++-------
drivers/net/atl1e/atl1e.h | 2 +-
drivers/net/atl1e/atl1e_main.c | 1 +
drivers/net/bonding/bond_main.c | 1 +
drivers/tty/tty_io.c | 14 ++++-
include/linux/netdevice.h | 2 +-
include/linux/skbuff.h | 8 +++
include/linux/socket.h | 3 +-
include/net/scm.h | 2 +-
net/atm/common.c | 2 +
net/ax25/af_ax25.c | 1 +
net/bluetooth/af_bluetooth.c | 4 +-
net/bluetooth/rfcomm/sock.c | 1 +
net/caif/caif_socket.c | 2 +
net/core/dev.c | 1 +
net/core/dev_addr_lists.c | 6 +-
net/core/rtnetlink.c | 4 +-
net/core/sock.c | 14 +++--
net/ipv4/esp4.c | 6 +-
net/ipv4/ip_fragment.c | 15 +++--
net/ipv4/syncookies.c | 4 +-
net/ipv4/tcp_input.c | 65 ++++++++++---------
net/ipv6/addrconf.c | 27 ++++++++
net/ipv6/reassembly.c | 13 +++-
net/irda/af_irda.c | 2 +
net/iucv/af_iucv.c | 2 +
net/llc/af_llc.c | 2 +
net/netrom/af_netrom.c | 1 +
net/rose/af_rose.c | 1 +
net/sched/sch_cbq.c | 5 +-
net/sctp/auth.c | 2 +-
net/tipc/socket.c | 7 +++
net/unix/af_unix.c | 2 +-
42 files changed, 405 insertions(+), 121 deletions(-)


2013-04-29 19:03:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 15/30] tcp: call tcp_replace_ts_recent() from tcp_ack()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Eric Dumazet <[email protected]>

[ Upstream commit 12fb3dd9dc3c64ba7d64cec977cca9b5fb7b1d4e ]

commit bd090dfc634d (tcp: tcp_replace_ts_recent() should not be called
from tcp_validate_incoming()) introduced a TS ecr bug in slow path
processing.

1 A > B P. 1:10001(10000) ack 1 <nop,nop,TS val 1001 ecr 200>
2 B < A . 1:1(0) ack 1 win 257 <sack 9001:10001,TS val 300 ecr 1001>
3 A > B . 1:1001(1000) ack 1 win 227 <nop,nop,TS val 1002 ecr 200>
4 A > B . 1001:2001(1000) ack 1 win 227 <nop,nop,TS val 1002 ecr 200>

(ecr 200 should be ecr 300 in packets 3 & 4)

Problem is tcp_ack() can trigger send of new packets (retransmits),
reflecting the prior TSval, instead of the TSval contained in the
currently processed incoming packet.

Fix this by calling tcp_replace_ts_recent() from tcp_ack() after the
checks, but before the actions.

Reported-by: Yuchung Cheng <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Neal Cardwell <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_input.c | 65 +++++++++++++++++++++++++--------------------------
1 file changed, 32 insertions(+), 33 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -115,6 +115,7 @@ int sysctl_tcp_abc __read_mostly;
#define FLAG_DSACKING_ACK 0x800 /* SACK blocks contained D-SACK info */
#define FLAG_NONHEAD_RETRANS_ACKED 0x1000 /* Non-head rexmitted data was ACKed */
#define FLAG_SACK_RENEGING 0x2000 /* snd_una advanced to a sacked seq */
+#define FLAG_UPDATE_TS_RECENT 0x4000 /* tcp_replace_ts_recent() */

#define FLAG_ACKED (FLAG_DATA_ACKED|FLAG_SYN_ACKED)
#define FLAG_NOT_DUP (FLAG_DATA|FLAG_WIN_UPDATE|FLAG_ACKED)
@@ -3656,6 +3657,27 @@ static void tcp_send_challenge_ack(struc
}
}

+static void tcp_store_ts_recent(struct tcp_sock *tp)
+{
+ tp->rx_opt.ts_recent = tp->rx_opt.rcv_tsval;
+ tp->rx_opt.ts_recent_stamp = get_seconds();
+}
+
+static void tcp_replace_ts_recent(struct tcp_sock *tp, u32 seq)
+{
+ if (tp->rx_opt.saw_tstamp && !after(seq, tp->rcv_wup)) {
+ /* PAWS bug workaround wrt. ACK frames, the PAWS discard
+ * extra check below makes sure this can only happen
+ * for pure ACK frames. -DaveM
+ *
+ * Not only, also it occurs for expired timestamps.
+ */
+
+ if (tcp_paws_check(&tp->rx_opt, 0))
+ tcp_store_ts_recent(tp);
+ }
+}
+
/* This routine deals with incoming acks, but not outgoing ones. */
static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
{
@@ -3702,6 +3724,12 @@ static int tcp_ack(struct sock *sk, stru
prior_fackets = tp->fackets_out;
prior_in_flight = tcp_packets_in_flight(tp);

+ /* ts_recent update must be made after we are sure that the packet
+ * is in window.
+ */
+ if (flag & FLAG_UPDATE_TS_RECENT)
+ tcp_replace_ts_recent(tp, TCP_SKB_CB(skb)->seq);
+
if (!(flag & FLAG_SLOWPATH) && after(ack, prior_snd_una)) {
/* Window is constant, pure forward advance.
* No more checks are required.
@@ -3988,27 +4016,6 @@ u8 *tcp_parse_md5sig_option(struct tcphd
EXPORT_SYMBOL(tcp_parse_md5sig_option);
#endif

-static inline void tcp_store_ts_recent(struct tcp_sock *tp)
-{
- tp->rx_opt.ts_recent = tp->rx_opt.rcv_tsval;
- tp->rx_opt.ts_recent_stamp = get_seconds();
-}
-
-static inline void tcp_replace_ts_recent(struct tcp_sock *tp, u32 seq)
-{
- if (tp->rx_opt.saw_tstamp && !after(seq, tp->rcv_wup)) {
- /* PAWS bug workaround wrt. ACK frames, the PAWS discard
- * extra check below makes sure this can only happen
- * for pure ACK frames. -DaveM
- *
- * Not only, also it occurs for expired timestamps.
- */
-
- if (tcp_paws_check(&tp->rx_opt, 0))
- tcp_store_ts_recent(tp);
- }
-}
-
/* Sorry, PAWS as specified is broken wrt. pure-ACKs -DaveM
*
* It is not fatal. If this ACK does _not_ change critical state (seqs, window)
@@ -5477,14 +5484,10 @@ slow_path:
return 0;

step5:
- if (th->ack && tcp_ack(sk, skb, FLAG_SLOWPATH) < 0)
+ if (th->ack &&
+ tcp_ack(sk, skb, FLAG_SLOWPATH | FLAG_UPDATE_TS_RECENT) < 0)
goto discard;

- /* ts_recent update must be made after we are sure that the packet
- * is in window.
- */
- tcp_replace_ts_recent(tp, TCP_SKB_CB(skb)->seq);
-
tcp_rcv_rtt_measure_ts(sk, skb);

/* Process urgent data. */
@@ -5848,7 +5851,8 @@ int tcp_rcv_state_process(struct sock *s

/* step 5: check the ACK field */
if (th->ack) {
- int acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH) > 0;
+ int acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH |
+ FLAG_UPDATE_TS_RECENT) > 0;

switch (sk->sk_state) {
case TCP_SYN_RECV:
@@ -5961,11 +5965,6 @@ int tcp_rcv_state_process(struct sock *s
} else
goto discard;

- /* ts_recent update must be made after we are sure that the packet
- * is in window.
- */
- tcp_replace_ts_recent(tp, TCP_SKB_CB(skb)->seq);
-
/* step 6: check the URG bit */
tcp_urg(sk, skb, th);


2013-04-29 19:03:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 19/30] Bluetooth: fix possible info leak in bt_sock_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit 4683f42fde3977bdb4e8a09622788cc8b5313778 ]

In case the socket is already shutting down, bt_sock_recvmsg() returns
with 0 without updating msg_namelen leading to net/socket.c leaking the
local, uninitialized sockaddr_storage variable to userland -- 128 bytes
of kernel stack memory.

Fix this by moving the msg_namelen assignment in front of the shutdown
test.

Signed-off-by: Mathias Krause <[email protected]>
Cc: Marcel Holtmann <[email protected]>
Cc: Gustavo Padovan <[email protected]>
Cc: Johan Hedberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/bluetooth/af_bluetooth.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -245,6 +245,8 @@ int bt_sock_recvmsg(struct kiocb *iocb,
if (flags & (MSG_OOB))
return -EOPNOTSUPP;

+ msg->msg_namelen = 0;
+
skb = skb_recv_datagram(sk, flags, noblock, &err);
if (!skb) {
if (sk->sk_shutdown & RCV_SHUTDOWN)
@@ -252,8 +254,6 @@ int bt_sock_recvmsg(struct kiocb *iocb,
return err;
}

- msg->msg_namelen = 0;
-
copied = skb->len;
if (len < copied) {
msg->msg_flags |= MSG_TRUNC;

2013-04-29 19:03:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 30/30] sparc32: support atomic64_t

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sam Ravnborg <[email protected]>

commit aea1181b0bd0a09c54546399768f359d1e198e45 upstream, Needed to
compile ext4 for sparc32 since commit
503f4bdcc078e7abee273a85ce322de81b18a224

There is no-one that really require atomic64_t support on sparc32.
But several drivers fails to build without proper atomic64 support.
And for an allyesconfig build for sparc32 this is annoying.

Include the generic atomic64_t support for sparc32.
This has a text footprint cost:

$size vmlinux (before atomic64_t support)
text data bss dec hex filename
3578860 134260 108781 3821901 3a514d vmlinux

$size vmlinux (after atomic64_t support)
text data bss dec hex filename
3579892 130684 108781 3819357 3a475d vmlinux

text increase (3579892 - 3578860) = 1032 bytes

data decreases - but I fail to explain why!
I have rebuild twice to check my numbers.

Signed-off-by: Sam Ravnborg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Andreas Larsson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/atomic_32.h | 2 ++
2 files changed, 3 insertions(+)

--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -31,6 +31,7 @@ config SPARC

config SPARC32
def_bool !64BIT
+ select GENERIC_ATOMIC64

config SPARC64
def_bool 64BIT
--- a/arch/sparc/include/asm/atomic_32.h
+++ b/arch/sparc/include/asm/atomic_32.h
@@ -15,6 +15,8 @@

#ifdef __KERNEL__

+#include <asm-generic/atomic64.h>
+
#include <asm/system.h>

#define ATOMIC_INIT(i) { (i) }

2013-04-29 19:03:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 22/30] irda: Fix missing msg_namelen update in irda_recvmsg_dgram()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit 5ae94c0d2f0bed41d6718be743985d61b7f5c47d ]

The current code does not fill the msg_name member in case it is set.
It also does not set the msg_namelen member to 0 and therefore makes
net/socket.c leak the local, uninitialized sockaddr_storage variable
to userland -- 128 bytes of kernel stack memory.

Fix that by simply setting msg_namelen to 0 as obviously nobody cared
about irda_recvmsg_dgram() not filling the msg_name in case it was
set.

Signed-off-by: Mathias Krause <[email protected]>
Cc: Samuel Ortiz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/irda/af_irda.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1386,6 +1386,8 @@ static int irda_recvmsg_dgram(struct kio

IRDA_DEBUG(4, "%s()\n", __func__);

+ msg->msg_namelen = 0;
+
skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
flags & MSG_DONTWAIT, &err);
if (!skb)

2013-04-29 19:04:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 29/30] net: drop dst before queueing fragments

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Eric Dumazet <[email protected]>

[ Upstream commit 97599dc792b45b1669c3cdb9a4b365aad0232f65 ]

Commit 4a94445c9a5c (net: Use ip_route_input_noref() in input path)
added a bug in IP defragmentation handling, as non refcounted
dst could escape an RCU protected section.

Commit 64f3b9e203bd068 (net: ip_expire() must revalidate route) fixed
the case of timeouts, but not the general problem.

Tom Parkin noticed crashes in UDP stack and provided a patch,
but further analysis permitted us to pinpoint the root cause.

Before queueing a packet into a frag list, we must drop its dst,
as this dst has limited lifetime (RCU protected)

When/if a packet is finally reassembled, we use the dst of the very
last skb, still protected by RCU and valid, as the dst of the
reassembled packet.

Use same logic in IPv6, as there is no need to hold dst references.

Reported-by: Tom Parkin <[email protected]>
Tested-by: Tom Parkin <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/ip_fragment.c | 15 +++++++++++----
net/ipv6/reassembly.c | 13 +++++++++++--
2 files changed, 22 insertions(+), 6 deletions(-)

--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -251,8 +251,7 @@ static void ip_expire(unsigned long arg)
if (!head->dev)
goto out_rcu_unlock;

- /* skb dst is stale, drop it, and perform route lookup again */
- skb_dst_drop(head);
+ /* skb has no dst, perform route lookup again */
iph = ip_hdr(head);
err = ip_route_input_noref(head, iph->daddr, iph->saddr,
iph->tos, head->dev);
@@ -517,8 +516,16 @@ found:
qp->q.last_in |= INET_FRAG_FIRST_IN;

if (qp->q.last_in == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
- qp->q.meat == qp->q.len)
- return ip_frag_reasm(qp, prev, dev);
+ qp->q.meat == qp->q.len) {
+ unsigned long orefdst = skb->_skb_refdst;
+
+ skb->_skb_refdst = 0UL;
+ err = ip_frag_reasm(qp, prev, dev);
+ skb->_skb_refdst = orefdst;
+ return err;
+ }
+
+ skb_dst_drop(skb);

write_lock(&ip4_frags.lock);
list_move_tail(&qp->q.lru_list, &qp->q.net->lru_list);
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -385,8 +385,17 @@ found:
}

if (fq->q.last_in == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
- fq->q.meat == fq->q.len)
- return ip6_frag_reasm(fq, prev, dev);
+ fq->q.meat == fq->q.len) {
+ int res;
+ unsigned long orefdst = skb->_skb_refdst;
+
+ skb->_skb_refdst = 0UL;
+ res = ip6_frag_reasm(fq, prev, dev);
+ skb->_skb_refdst = orefdst;
+ return res;
+ }
+
+ skb_dst_drop(skb);

write_lock(&ip6_frags.lock);
list_move_tail(&fq->q.lru_list, &fq->q.net->lru_list);

2013-04-29 19:04:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 28/30] netrom: fix invalid use of sizeof in nr_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Wei Yongjun <[email protected]>

[ Upstream commit c802d759623acbd6e1ee9fbdabae89159a513913 ]

sizeof() when applied to a pointer typed expression gives the size of the
pointer, not that of the pointed data.
Introduced by commit 3ce5ef(netrom: fix info leak via msg_name in nr_recvmsg)

Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netrom/af_netrom.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1178,7 +1178,7 @@ static int nr_recvmsg(struct kiocb *iocb
}

if (sax != NULL) {
- memset(sax, 0, sizeof(sax));
+ memset(sax, 0, sizeof(*sax));
sax->sax25_family = AF_NETROM;
skb_copy_from_linear_data_offset(skb, 7, sax->sax25_call.ax25_call,
AX25_ADDR_LEN);

2013-04-29 19:04:57

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 27/30] tipc: fix info leaks via msg_name in recv_msg/recv_stream

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit 60085c3d009b0df252547adb336d1ccca5ce52ec ]

The code in set_orig_addr() does not initialize all of the members of
struct sockaddr_tipc when filling the sockaddr info -- namely the union
is only partly filled. This will make recv_msg() and recv_stream() --
the only users of this function -- leak kernel stack memory as the
msg_name member is a local variable in net/socket.c.

Additionally to that both recv_msg() and recv_stream() fail to update
the msg_namelen member to 0 while otherwise returning with 0, i.e.
"success". This is the case for, e.g., non-blocking sockets. This will
lead to a 128 byte kernel stack leak in net/socket.c.

Fix the first issue by initializing the memory of the union with
memset(0). Fix the second one by setting msg_namelen to 0 early as it
will be updated later if we're going to fill the msg_name member.

Signed-off-by: Mathias Krause <[email protected]>
Cc: Jon Maloy <[email protected]>
Cc: Allan Stephens <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/tipc/socket.c | 7 +++++++
1 file changed, 7 insertions(+)

--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -829,6 +829,7 @@ static void set_orig_addr(struct msghdr
if (addr) {
addr->family = AF_TIPC;
addr->addrtype = TIPC_ADDR_ID;
+ memset(&addr->addr, 0, sizeof(addr->addr));
addr->addr.id.ref = msg_origport(msg);
addr->addr.id.node = msg_orignode(msg);
addr->addr.name.domain = 0; /* could leave uninitialized */
@@ -948,6 +949,9 @@ static int recv_msg(struct kiocb *iocb,
goto exit;
}

+ /* will be updated in set_orig_addr() if needed */
+ m->msg_namelen = 0;
+
timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
restart:

@@ -1074,6 +1078,9 @@ static int recv_stream(struct kiocb *ioc
goto exit;
}

+ /* will be updated in set_orig_addr() if needed */
+ m->msg_namelen = 0;
+
target = sock_rcvlowat(sk, flags & MSG_WAITALL, buf_len);
timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
restart:

2013-04-29 19:03:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 17/30] atm: update msg_namelen in vcc_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit 9b3e617f3df53822345a8573b6d358f6b9e5ed87 ]

The current code does not fill the msg_name member in case it is set.
It also does not set the msg_namelen member to 0 and therefore makes
net/socket.c leak the local, uninitialized sockaddr_storage variable
to userland -- 128 bytes of kernel stack memory.

Fix that by simply setting msg_namelen to 0 as obviously nobody cared
about vcc_recvmsg() not filling the msg_name in case it was set.

Signed-off-by: Mathias Krause <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/atm/common.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -500,6 +500,8 @@ int vcc_recvmsg(struct kiocb *iocb, stru
struct sk_buff *skb;
int copied, error = -EINVAL;

+ msg->msg_namelen = 0;
+
if (sock->state != SS_CONNECTED)
return -ENOTCONN;
if (flags & ~MSG_DONTWAIT) /* only handle MSG_DONTWAIT */

2013-04-29 19:05:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 26/30] rose: fix info leak via msg_name in rose_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit 4a184233f21645cf0b719366210ed445d1024d72 ]

The code in rose_recvmsg() does not initialize all of the members of
struct sockaddr_rose/full_sockaddr_rose when filling the sockaddr info.
Nor does it initialize the padding bytes of the structure inserted by
the compiler for alignment. This will lead to leaking uninitialized
kernel stack bytes in net/socket.c.

Fix the issue by initializing the memory used for sockaddr info with
memset(0).

Signed-off-by: Mathias Krause <[email protected]>
Cc: Ralf Baechle <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/rose/af_rose.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1258,6 +1258,7 @@ static int rose_recvmsg(struct kiocb *io
skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);

if (srose != NULL) {
+ memset(srose, 0, msg->msg_namelen);
srose->srose_family = AF_ROSE;
srose->srose_addr = rose->dest_addr;
srose->srose_call = rose->dest_call;

2013-04-29 19:05:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 25/30] netrom: fix info leak via msg_name in nr_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commits 3ce5efad47b62c57a4f5c54248347085a750ce0e and
c802d759623acbd6e1ee9fbdabae89159a513913 ]

In case msg_name is set the sockaddr info gets filled out, as
requested, but the code fails to initialize the padding bytes of
struct sockaddr_ax25 inserted by the compiler for alignment. Also
the sax25_ndigis member does not get assigned, leaking four more
bytes.

Both issues lead to the fact that the code will leak uninitialized
kernel stack bytes in net/socket.c.

Fix both issues by initializing the memory with memset(0).

Signed-off-by: Mathias Krause <[email protected]>
Cc: Ralf Baechle <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netrom/af_netrom.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1178,6 +1178,7 @@ static int nr_recvmsg(struct kiocb *iocb
}

if (sax != NULL) {
+ memset(sax, 0, sizeof(sax));
sax->sax25_family = AF_NETROM;
skb_copy_from_linear_data_offset(skb, 7, sax->sax25_call.ax25_call,
AX25_ADDR_LEN);

2013-04-29 19:06:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 23/30] iucv: Fix missing msg_namelen update in iucv_sock_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit a5598bd9c087dc0efc250a5221e5d0e6f584ee88 ]

The current code does not fill the msg_name member in case it is set.
It also does not set the msg_namelen member to 0 and therefore makes
net/socket.c leak the local, uninitialized sockaddr_storage variable
to userland -- 128 bytes of kernel stack memory.

Fix that by simply setting msg_namelen to 0 as obviously nobody cared
about iucv_sock_recvmsg() not filling the msg_name in case it was set.

Signed-off-by: Mathias Krause <[email protected]>
Cc: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/iucv/af_iucv.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1157,6 +1157,8 @@ static int iucv_sock_recvmsg(struct kioc
struct sk_buff *skb, *rskb, *cskb;
int err = 0;

+ msg->msg_namelen = 0;
+
if ((sk->sk_state == IUCV_DISCONN || sk->sk_state == IUCV_SEVERED) &&
skb_queue_empty(&iucv->backlog_skb_q) &&
skb_queue_empty(&sk->sk_receive_queue) &&

2013-04-29 19:06:10

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 24/30] llc: Fix missing msg_namelen update in llc_ui_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit c77a4b9cffb6215a15196ec499490d116dfad181 ]

For stream sockets the code misses to update the msg_namelen member
to 0 and therefore makes net/socket.c leak the local, uninitialized
sockaddr_storage variable to userland -- 128 bytes of kernel stack
memory. The msg_namelen update is also missing for datagram sockets
in case the socket is shutting down during receive.

Fix both issues by setting msg_namelen to 0 early. It will be
updated later if we're going to fill the msg_name member.

Signed-off-by: Mathias Krause <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/llc/af_llc.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -720,6 +720,8 @@ static int llc_ui_recvmsg(struct kiocb *
int target; /* Read at least this many bytes */
long timeo;

+ msg->msg_namelen = 0;
+
lock_sock(sk);
copied = -ENOTCONN;
if (unlikely(sk->sk_type == SOCK_STREAM && sk->sk_state == TCP_LISTEN))

2013-04-29 19:06:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 13/30] esp4: fix error return code in esp_output()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Wei Yongjun <[email protected]>

[ Upstream commit 06848c10f720cbc20e3b784c0df24930b7304b93 ]

Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/esp4.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -137,8 +137,6 @@ static int esp_output(struct xfrm_state

/* skb is pure payload to encrypt */

- err = -ENOMEM;
-
esp = x->data;
aead = esp->aead;
alen = crypto_aead_authsize(aead);
@@ -174,8 +172,10 @@ static int esp_output(struct xfrm_state
}

tmp = esp_alloc_tmp(aead, nfrags + sglists, seqhilen);
- if (!tmp)
+ if (!tmp) {
+ err = -ENOMEM;
goto error;
+ }

seqhi = esp_tmp_seqhi(tmp);
iv = esp_tmp_iv(aead, tmp, seqhilen);

2013-04-29 19:07:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 21/30] caif: Fix missing msg_namelen update in caif_seqpkt_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit 2d6fbfe733f35c6b355c216644e08e149c61b271 ]

The current code does not fill the msg_name member in case it is set.
It also does not set the msg_namelen member to 0 and therefore makes
net/socket.c leak the local, uninitialized sockaddr_storage variable
to userland -- 128 bytes of kernel stack memory.

Fix that by simply setting msg_namelen to 0 as obviously nobody cared
about caif_seqpkt_recvmsg() not filling the msg_name in case it was
set.

Signed-off-by: Mathias Krause <[email protected]>
Cc: Sjur Braendeland <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/caif/caif_socket.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -320,6 +320,8 @@ static int caif_seqpkt_recvmsg(struct ki
if (m->msg_flags&MSG_OOB)
goto read_error;

+ m->msg_namelen = 0;
+
skb = skb_recv_datagram(sk, flags, 0 , &ret);
if (!skb)
goto read_error;

2013-04-29 19:07:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 20/30] Bluetooth: RFCOMM - Fix missing msg_namelen update in rfcomm_sock_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit e11e0455c0d7d3d62276a0c55d9dfbc16779d691 ]

If RFCOMM_DEFER_SETUP is set in the flags, rfcomm_sock_recvmsg() returns
early with 0 without updating the possibly set msg_namelen member. This,
in turn, leads to a 128 byte kernel stack leak in net/socket.c.

Fix this by updating msg_namelen in this case. For all other cases it
will be handled in bt_sock_stream_recvmsg().

Signed-off-by: Mathias Krause <[email protected]>
Cc: Marcel Holtmann <[email protected]>
Cc: Gustavo Padovan <[email protected]>
Cc: Johan Hedberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/bluetooth/rfcomm/sock.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -624,6 +624,7 @@ static int rfcomm_sock_recvmsg(struct ki

if (test_and_clear_bit(RFCOMM_DEFER_SETUP, &d->flags)) {
rfcomm_dlc_accept(d);
+ msg->msg_namelen = 0;
return 0;
}


2013-04-29 19:03:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 09/30] af_unix: If we dont care about credentials coallesce all messages

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: "Eric W. Biederman" <[email protected]>

[ Upstream commit 0e82e7f6dfeec1013339612f74abc2cdd29d43d2 ]

It was reported that the following LSB test case failed
https://lsbbugs.linuxfoundation.org/attachment.cgi?id=2144 because we
were not coallescing unix stream messages when the application was
expecting us to.

The problem was that the first send was before the socket was accepted
and thus sock->sk_socket was NULL in maybe_add_creds, and the second
send after the socket was accepted had a non-NULL value for sk->socket
and thus we could tell the credentials were not needed so we did not
bother.

The unnecessary credentials on the first message cause
unix_stream_recvmsg to start verifying that all messages had the same
credentials before coallescing and then the coallescing failed because
the second message had no credentials.

Ignoring credentials when we don't care in unix_stream_recvmsg fixes a
long standing pessimization which would fail to coallesce messages when
reading from a unix stream socket if the senders were different even if
we did not care about their credentials.

I have tested this and verified that the in the LSB test case mentioned
above that the messages do coallesce now, while the were failing to
coallesce without this change.

Reported-by: Karel Srot <[email protected]>
Reported-by: Ding Tianhong <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/unix/af_unix.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1940,7 +1940,7 @@ static int unix_stream_recvmsg(struct ki
skb_queue_head(&sk->sk_receive_queue, skb);
break;
}
- } else {
+ } else if (test_bit(SOCK_PASSCRED, &sock->flags)) {
/* Copy credentials */
scm_set_cred(siocb->scm, UNIXCB(skb).pid, UNIXCB(skb).cred);
check_creds = 1;

2013-04-29 19:07:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 18/30] ax25: fix info leak via msg_name in ax25_recvmsg()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Mathias Krause <[email protected]>

[ Upstream commit ef3313e84acbf349caecae942ab3ab731471f1a1 ]

When msg_namelen is non-zero the sockaddr info gets filled out, as
requested, but the code fails to initialize the padding bytes of struct
sockaddr_ax25 inserted by the compiler for alignment. Additionally the
msg_namelen value is updated to sizeof(struct full_sockaddr_ax25) but is
not always filled up to this size.

Both issues lead to the fact that the code will leak uninitialized
kernel stack bytes in net/socket.c.

Fix both issues by initializing the memory with memset(0).

Signed-off-by: Mathias Krause <[email protected]>
Cc: Ralf Baechle <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ax25/af_ax25.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1641,6 +1641,7 @@ static int ax25_recvmsg(struct kiocb *io
ax25_address src;
const unsigned char *mac = skb_mac_header(skb);

+ memset(sax, 0, sizeof(struct full_sockaddr_ax25));
ax25_addr_parse(mac + 1, skb->data - mac - 1, &src, NULL,
&digi, NULL, NULL);
sax->sax25_family = AF_AX25;

2013-04-29 19:08:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 16/30] net: fix incorrect credentials passing

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Linus Torvalds <[email protected]>

[ Upstream commit 83f1b4ba917db5dc5a061a44b3403ddb6e783494 ]

Commit 257b5358b32f ("scm: Capture the full credentials of the scm
sender") changed the credentials passing code to pass in the effective
uid/gid instead of the real uid/gid.

Obviously this doesn't matter most of the time (since normally they are
the same), but it results in differences for suid binaries when the wrong
uid/gid ends up being used.

This just undoes that (presumably unintentional) part of the commit.

Reported-by: Andy Lutomirski <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Acked-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/socket.h | 3 ++-
include/net/scm.h | 2 +-
net/core/sock.c | 14 ++++++++++----
3 files changed, 13 insertions(+), 6 deletions(-)

--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -312,7 +312,8 @@ struct ucred {
/* IPX options */
#define IPX_TYPE 1

-extern void cred_to_ucred(struct pid *pid, const struct cred *cred, struct ucred *ucred);
+extern void cred_to_ucred(struct pid *pid, const struct cred *cred, struct ucred *ucred,
+ bool use_effective);

extern int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len);
extern int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov,
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -50,7 +50,7 @@ static __inline__ void scm_set_cred(stru
{
scm->pid = get_pid(pid);
scm->cred = get_cred(cred);
- cred_to_ucred(pid, cred, &scm->creds);
+ cred_to_ucred(pid, cred, &scm->creds, false);
}

static __inline__ void scm_destroy_cred(struct scm_cookie *scm)
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -752,15 +752,20 @@ EXPORT_SYMBOL(sock_setsockopt);


void cred_to_ucred(struct pid *pid, const struct cred *cred,
- struct ucred *ucred)
+ struct ucred *ucred, bool use_effective)
{
ucred->pid = pid_vnr(pid);
ucred->uid = ucred->gid = -1;
if (cred) {
struct user_namespace *current_ns = current_user_ns();

- ucred->uid = user_ns_map_uid(current_ns, cred, cred->euid);
- ucred->gid = user_ns_map_gid(current_ns, cred, cred->egid);
+ if (use_effective) {
+ ucred->uid = user_ns_map_uid(current_ns, cred, cred->euid);
+ ucred->gid = user_ns_map_gid(current_ns, cred, cred->egid);
+ } else {
+ ucred->uid = user_ns_map_uid(current_ns, cred, cred->uid);
+ ucred->gid = user_ns_map_gid(current_ns, cred, cred->gid);
+ }
}
}
EXPORT_SYMBOL_GPL(cred_to_ucred);
@@ -921,7 +926,8 @@ int sock_getsockopt(struct socket *sock,
struct ucred peercred;
if (len > sizeof(peercred))
len = sizeof(peercred);
- cred_to_ucred(sk->sk_peer_pid, sk->sk_peer_cred, &peercred);
+ cred_to_ucred(sk->sk_peer_pid, sk->sk_peer_cred,
+ &peercred, true);
if (copy_to_user(optval, &peercred, len))
return -EFAULT;
goto lenout;

2013-04-29 19:08:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 14/30] net: sctp: sctp_auth_key_put: use kzfree instead of kfree

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Daniel Borkmann <[email protected]>

[ Upstream commit 586c31f3bf04c290dc0a0de7fc91d20aa9a5ee53 ]

For sensitive data like keying material, it is common practice to zero
out keys before returning the memory back to the allocator. Thus, use
kzfree instead of kfree.

Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Neil Horman <[email protected]>
Acked-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sctp/auth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/sctp/auth.c
+++ b/net/sctp/auth.c
@@ -71,7 +71,7 @@ void sctp_auth_key_put(struct sctp_auth_
return;

if (atomic_dec_and_test(&key->refcnt)) {
- kfree(key);
+ kzfree(key);
SCTP_DBG_OBJCNT_DEC(keys);
}
}

2013-04-29 19:03:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 05/30] net IPv6 : Fix broken IPv6 routing table after loopback down-up

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Balakumaran Kannan <[email protected]>

[ Upstream commit 25fb6ca4ed9cad72f14f61629b68dc03c0d9713f ]

IPv6 Routing table becomes broken once we do ifdown, ifup of the loopback(lo)
interface. After down-up, routes of other interface's IPv6 addresses through
'lo' are lost.

IPv6 addresses assigned to all interfaces are routed through 'lo' for internal
communication. Once 'lo' is down, those routing entries are removed from routing
table. But those removed entries are not being re-created properly when 'lo' is
brought up. So IPv6 addresses of other interfaces becomes unreachable from the
same machine. Also this breaks communication with other machines because of
NDISC packet processing failure.

This patch fixes this issue by reading all interface's IPv6 addresses and adding
them to IPv6 routing table while bringing up 'lo'.

==Testing==
Before applying the patch:
$ route -A inet6
Kernel IPv6 routing table
Destination Next Hop Flag Met Ref Use If
2000::20/128 :: U 256 0 0 eth0
fe80::/64 :: U 256 0 0 eth0
::/0 :: !n -1 1 1 lo
::1/128 :: Un 0 1 0 lo
2000::20/128 :: Un 0 1 0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128 :: Un 0 1 0 lo
ff00::/8 :: U 256 0 0 eth0
::/0 :: !n -1 1 1 lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination Next Hop Flag Met Ref Use If
2000::20/128 :: U 256 0 0 eth0
fe80::/64 :: U 256 0 0 eth0
::/0 :: !n -1 1 1 lo
::1/128 :: Un 0 1 0 lo
ff00::/8 :: U 256 0 0 eth0
::/0 :: !n -1 1 1 lo
$

After applying the patch:
$ route -A inet6
Kernel IPv6 routing
table
Destination Next Hop Flag Met Ref Use If
2000::20/128 :: U 256 0 0 eth0
fe80::/64 :: U 256 0 0 eth0
::/0 :: !n -1 1 1 lo
::1/128 :: Un 0 1 0 lo
2000::20/128 :: Un 0 1 0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128 :: Un 0 1 0 lo
ff00::/8 :: U 256 0 0 eth0
::/0 :: !n -1 1 1 lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination Next Hop Flag Met Ref Use If
2000::20/128 :: U 256 0 0 eth0
fe80::/64 :: U 256 0 0 eth0
::/0 :: !n -1 1 1 lo
::1/128 :: Un 0 1 0 lo
2000::20/128 :: Un 0 1 0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128 :: Un 0 1 0 lo
ff00::/8 :: U 256 0 0 eth0
::/0 :: !n -1 1 1 lo
$

Signed-off-by: Balakumaran Kannan <[email protected]>
Signed-off-by: Maruthi Thotad <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/addrconf.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2327,6 +2327,9 @@ static void sit_add_v4_addrs(struct inet
static void init_loopback(struct net_device *dev)
{
struct inet6_dev *idev;
+ struct net_device *sp_dev;
+ struct inet6_ifaddr *sp_ifa;
+ struct rt6_info *sp_rt;

/* ::1 */

@@ -2338,6 +2341,30 @@ static void init_loopback(struct net_dev
}

add_addr(idev, &in6addr_loopback, 128, IFA_HOST);
+
+ /* Add routes to other interface's IPv6 addresses */
+ for_each_netdev(dev_net(dev), sp_dev) {
+ if (!strcmp(sp_dev->name, dev->name))
+ continue;
+
+ idev = __in6_dev_get(sp_dev);
+ if (!idev)
+ continue;
+
+ read_lock_bh(&idev->lock);
+ list_for_each_entry(sp_ifa, &idev->addr_list, if_list) {
+
+ if (sp_ifa->flags & (IFA_F_DADFAILED | IFA_F_TENTATIVE))
+ continue;
+
+ sp_rt = addrconf_dst_alloc(idev, &sp_ifa->addr, 0);
+
+ /* Failure cases are ignored */
+ if (!IS_ERR(sp_rt))
+ ip6_ins_rt(sp_rt);
+ }
+ read_unlock_bh(&idev->lock);
+ }
}

static void addrconf_add_linklocal(struct inet6_dev *idev, const struct in6_addr *addr)

2013-04-29 19:08:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 12/30] tcp: incoming connections might use wrong route under synflood

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Dmitry Popov <[email protected]>

[ Upstream commit d66954a066158781ccf9c13c91d0316970fe57b6 ]

There is a bug in cookie_v4_check (net/ipv4/syncookies.c):
flowi4_init_output(&fl4, 0, sk->sk_mark, RT_CONN_FLAGS(sk),
RT_SCOPE_UNIVERSE, IPPROTO_TCP,
inet_sk_flowi_flags(sk),
(opt && opt->srr) ? opt->faddr : ireq->rmt_addr,
ireq->loc_addr, th->source, th->dest);

Here we do not respect sk->sk_bound_dev_if, therefore wrong dst_entry may be
taken. This dst_entry is used by new socket (get_cookie_sock ->
tcp_v4_syn_recv_sock), so its packets may take the wrong path.

Signed-off-by: Dmitry Popov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/syncookies.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -345,8 +345,8 @@ struct sock *cookie_v4_check(struct sock
* hasn't changed since we received the original syn, but I see
* no easy way to do this.
*/
- flowi4_init_output(&fl4, 0, sk->sk_mark, RT_CONN_FLAGS(sk),
- RT_SCOPE_UNIVERSE, IPPROTO_TCP,
+ flowi4_init_output(&fl4, sk->sk_bound_dev_if, sk->sk_mark,
+ RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE, IPPROTO_TCP,
inet_sk_flowi_flags(sk),
(opt && opt->srr) ? opt->faddr : ireq->rmt_addr,
ireq->loc_addr, th->source, th->dest);

2013-04-29 19:09:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 11/30] rtnetlink: Call nlmsg_parse() with correct header length

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Michael Riesch <[email protected]>

[ Upstream commit 88c5b5ce5cb57af6ca2a7cf4d5715fa320448ff9 ]

Signed-off-by: Michael Riesch <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Benc <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: [email protected]
Acked-by: Mark Rustad <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/rtnetlink.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1045,7 +1045,7 @@ static int rtnl_dump_ifinfo(struct sk_bu

rcu_read_lock();

- if (nlmsg_parse(cb->nlh, sizeof(struct rtgenmsg), tb, IFLA_MAX,
+ if (nlmsg_parse(cb->nlh, sizeof(struct ifinfomsg), tb, IFLA_MAX,
ifla_policy) >= 0) {

if (tb[IFLA_EXT_MASK])
@@ -1876,7 +1876,7 @@ static u16 rtnl_calcit(struct sk_buff *s
u32 ext_filter_mask = 0;
u16 min_ifinfo_dump_size = 0;

- if (nlmsg_parse(nlh, sizeof(struct rtgenmsg), tb, IFLA_MAX,
+ if (nlmsg_parse(nlh, sizeof(struct ifinfomsg), tb, IFLA_MAX,
ifla_policy) >= 0) {
if (tb[IFLA_EXT_MASK])
ext_filter_mask = nla_get_u32(tb[IFLA_EXT_MASK]);

2013-04-29 19:09:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 10/30] netfilter: dont reset nf_trace in nf_reset()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Patrick McHardy <[email protected]>

[ Upstream commit 124dff01afbdbff251f0385beca84ba1b9adda68 ]

Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code
to reset nf_trace in nf_reset(). This is wrong and unnecessary.

nf_reset() is used in the following cases:

- when passing packets up the the socket layer, at which point we want to
release all netfilter references that might keep modules pinned while
the packet is queued. nf_trace doesn't matter anymore at this point.

- when encapsulating or decapsulating IPsec packets. We want to continue
tracing these packets after IPsec processing.

- when passing packets through virtual network devices. Only devices on
that encapsulate in IPv4/v6 matter since otherwise nf_trace is not
used anymore. Its not entirely clear whether those packets should
be traced after that, however we've always done that.

- when passing packets through virtual network devices that make the
packet cross network namespace boundaries. This is the only cases
where we clearly want to reset nf_trace and is also what the
original patch intended to fix.

Add a new function nf_reset_trace() and use it in dev_forward_skb() to
fix this properly.

Signed-off-by: Patrick McHardy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/skbuff.h | 8 ++++++++
net/core/dev.c | 1 +
2 files changed, 9 insertions(+)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2135,6 +2135,14 @@ static inline void nf_reset(struct sk_bu
#endif
}

+static inline void nf_reset_trace(struct sk_buff *skb)
+{
+#if defined(CONFIG_NETFILTER_XT_TARGET_TRACE) || \
+ defined(CONFIG_NETFILTER_XT_TARGET_TRACE_MODULE)
+ skb->nf_trace = 0;
+#endif
+}
+
/* Note: This doesn't put any conntrack and bridge info in dst. */
static inline void __nf_copy(struct sk_buff *dst, const struct sk_buff *src)
{
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1545,6 +1545,7 @@ int dev_forward_skb(struct net_device *d
skb->mark = 0;
secpath_reset(skb);
nf_reset(skb);
+ nf_reset_trace(skb);
return netif_rx(skb);
}
EXPORT_SYMBOL_GPL(dev_forward_skb);

2013-04-29 19:09:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 08/30] bonding: IFF_BONDING is not stripped on enslave failure

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: "[email protected]" <[email protected]>

[ Upstream commit b6a5a7b9a528a8b4c8bec940b607c5dd9102b8cc ]

While enslaving a new device and after IFF_BONDING flag is set, in case
of failure it is not stripped from the device's priv_flags while
cleaning up, which could lead to other problems.
Cleaning at err_close because the flag is set after dev_open().

v2: no change

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/bonding/bond_main.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1949,6 +1949,7 @@ err_detach:
write_unlock_bh(&bond->lock);

err_close:
+ slave_dev->priv_flags &= ~IFF_BONDING;
dev_close(slave_dev);

err_unset_master:

2013-04-29 19:10:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 07/30] atl1e: limit gso segment size to prevent generation of wrong ip length fields

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Hannes Frederic Sowa <[email protected]>

[ Upstream commit 31d1670e73f4911fe401273a8f576edc9c2b5fea ]

The limit of 0x3c00 is taken from the windows driver.

Suggested-by: Huang, Xiong <[email protected]>
Cc: Huang, Xiong <[email protected]>
Cc: Eric Dumazet <[email protected]>
Signed-off-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/atl1e/atl1e.h | 2 +-
drivers/net/atl1e/atl1e_main.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/atl1e/atl1e.h
+++ b/drivers/net/atl1e/atl1e.h
@@ -186,7 +186,7 @@ struct atl1e_tpd_desc {
/* how about 0x2000 */
#define MAX_TX_BUF_LEN 0x2000
#define MAX_TX_BUF_SHIFT 13
-/*#define MAX_TX_BUF_LEN 0x3000 */
+#define MAX_TSO_SEG_SIZE 0x3c00

/* rrs word 1 bit 0:31 */
#define RRS_RX_CSUM_MASK 0xFFFF
--- a/drivers/net/atl1e/atl1e_main.c
+++ b/drivers/net/atl1e/atl1e_main.c
@@ -2333,6 +2333,7 @@ static int __devinit atl1e_probe(struct

INIT_WORK(&adapter->reset_task, atl1e_reset_task);
INIT_WORK(&adapter->link_chg_task, atl1e_link_chg_task);
+ netif_set_gso_max_size(netdev, MAX_TSO_SEG_SIZE);
err = register_netdev(netdev);
if (err) {
netdev_err(netdev, "register netdevice failed\n");

2013-04-29 19:10:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 04/30] cbq: incorrect processing of high limits

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Vasily Averin <[email protected]>

[ Upstream commit f0f6ee1f70c4eaab9d52cf7d255df4bd89f8d1c2 ]

currently cbq works incorrectly for limits > 10% real link bandwidth,
and practically does not work for limits > 50% real link bandwidth.
Below are results of experiments taken on 1 Gbit link

In shaper | Actual Result
-----------+---------------
100M | 108 Mbps
200M | 244 Mbps
300M | 412 Mbps
500M | 893 Mbps

This happen because of q->now changes incorrectly in cbq_dequeue():
when it is called before real end of packet transmitting,
L2T is greater than real time delay, q_now gets an extra boost
but never compensate it.

To fix this problem we prevent change of q->now until its synchronization
with real time.

Signed-off-by: Vasily Averin <[email protected]>
Reviewed-by: Alexey Kuznetsov <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sched/sch_cbq.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -963,8 +963,11 @@ cbq_dequeue(struct Qdisc *sch)
cbq_update(q);
if ((incr -= incr2) < 0)
incr = 0;
+ q->now += incr;
+ } else {
+ if (now > q->now)
+ q->now = now;
}
- q->now += incr;
q->now_rt = now;

for (;;) {

2013-04-29 19:11:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 06/30] net: count hw_addr syncs so that unsync works properly.

3.0-stable review patch. If anyone has any objections, please let me know.

------------------


From: Vlad Yasevich <[email protected]>

[ Upstream commit 4543fbefe6e06a9e40d9f2b28d688393a299f079 ]

A few drivers use dev_uc_sync/unsync to synchronize the
address lists from master down to slave/lower devices. In
some cases (bond/team) a single address list is synched down
to multiple devices. At the time of unsync, we have a leak
in these lower devices, because "synced" is treated as a
boolean and the address will not be unsynced for anything after
the first device/call.

Treat "synced" as a count (same as refcount) and allow all
unsync calls to work.

Signed-off-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/netdevice.h | 2 +-
net/core/dev_addr_lists.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -231,9 +231,9 @@ struct netdev_hw_addr {
#define NETDEV_HW_ADDR_T_SLAVE 3
#define NETDEV_HW_ADDR_T_UNICAST 4
#define NETDEV_HW_ADDR_T_MULTICAST 5
- bool synced;
bool global_use;
int refcount;
+ int synced;
struct rcu_head rcu_head;
};

--- a/net/core/dev_addr_lists.c
+++ b/net/core/dev_addr_lists.c
@@ -56,7 +56,7 @@ static int __hw_addr_add_ex(struct netde
ha->type = addr_type;
ha->refcount = 1;
ha->global_use = global;
- ha->synced = false;
+ ha->synced = 0;
list_add_tail_rcu(&ha->list, &list->list);
list->count++;
return 0;
@@ -154,7 +154,7 @@ int __hw_addr_sync(struct netdev_hw_addr
addr_len, ha->type);
if (err)
break;
- ha->synced = true;
+ ha->synced++;
ha->refcount++;
} else if (ha->refcount == 1) {
__hw_addr_del(to_list, ha->addr, addr_len, ha->type);
@@ -175,7 +175,7 @@ void __hw_addr_unsync(struct netdev_hw_a
if (ha->synced) {
__hw_addr_del(to_list, ha->addr,
addr_len, ha->type);
- ha->synced = false;
+ ha->synced--;
__hw_addr_del(from_list, ha->addr,
addr_len, ha->type);
}

2013-04-29 19:11:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 03/30] sparc64: Fix race in TLB batch processing.

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Commits f36391d2790d04993f48da6a45810033a2cdf847 and
f0af97070acbad5d6a361f485828223a4faaa0ee upstream. ]

As reported by Dave Kleikamp, when we emit cross calls to do batched
TLB flush processing we have a race because we do not synchronize on
the sibling cpus completing the cross call.

So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.)
and either flushes are missed or flushes will flush the wrong
addresses.

Fix this by using generic infrastructure to synchonize on the
completion of the cross call.

This first required getting the flush_tlb_pending() call out from
switch_to() which operates with locks held and interrupts disabled.
The problem is that smp_call_function_many() cannot be invoked with
IRQs disabled and this is explicitly checked for with WARN_ON_ONCE().

We get the batch processing outside of locked IRQ disabled sections by
using some ideas from the powerpc port. Namely, we only batch inside
of arch_{enter,leave}_lazy_mmu_mode() calls. If we're not in such a
region, we flush TLBs synchronously.

1) Get rid of xcall_flush_tlb_pending and per-cpu type
implementations.

2) Do TLB batch cross calls instead via:

smp_call_function_many()
tlb_pending_func()
__flush_tlb_pending()

3) Batch only in lazy mmu sequences:

a) Add 'active' member to struct tlb_batch
b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
c) Set 'active' in arch_enter_lazy_mmu_mode()
d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode()
e) Check 'active' in tlb_batch_add_one() and do a synchronous
flush if it's clear.

4) Add infrastructure for synchronous TLB page flushes.

a) Implement __flush_tlb_page and per-cpu variants, patch
as needed.
b) Likewise for xcall_flush_tlb_page.
c) Implement smp_flush_tlb_page() to invoke the cross-call.
d) Wire up global_flush_tlb_page() to the right routine based
upon CONFIG_SMP

5) It turns out that singleton batches are very common, 2 out of every
3 batch flushes have only a single entry in them.

The batch flush waiting is very expensive, both because of the poll
on sibling cpu completeion, as well as because passing the tlb batch
pointer to the sibling cpus invokes a shared memory dereference.

Therefore, in flush_tlb_pending(), if there is only one entry in
the batch perform a completely asynchronous global_flush_tlb_page()
instead.

Reported-by: Dave Kleikamp <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Acked-by: Dave Kleikamp <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/pgtable_64.h | 1
arch/sparc/include/asm/system_64.h | 3
arch/sparc/include/asm/tlbflush_64.h | 37 +++++++++-
arch/sparc/kernel/smp_64.c | 41 ++++++++++--
arch/sparc/mm/tlb.c | 39 ++++++++++-
arch/sparc/mm/tsb.c | 57 ++++++++++++----
arch/sparc/mm/ultra.S | 119 +++++++++++++++++++++++++++--------
7 files changed, 242 insertions(+), 55 deletions(-)

--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -774,6 +774,7 @@ static inline int io_remap_pfn_range(str
return remap_pfn_range(vma, from, phys_base >> PAGE_SHIFT, size, prot);
}

+#include <asm/tlbflush.h>
#include <asm-generic/pgtable.h>

/* We provide our own get_unmapped_area to cope with VA holes and
--- a/arch/sparc/include/asm/system_64.h
+++ b/arch/sparc/include/asm/system_64.h
@@ -140,8 +140,7 @@ do { \
* and 2 stores in this critical code path. -DaveM
*/
#define switch_to(prev, next, last) \
-do { flush_tlb_pending(); \
- save_and_clear_fpu(); \
+do { save_and_clear_fpu(); \
/* If you are tempted to conditionalize the following */ \
/* so that ASI is only written if it changes, think again. */ \
__asm__ __volatile__("wr %%g0, %0, %%asi" \
--- a/arch/sparc/include/asm/tlbflush_64.h
+++ b/arch/sparc/include/asm/tlbflush_64.h
@@ -11,24 +11,40 @@
struct tlb_batch {
struct mm_struct *mm;
unsigned long tlb_nr;
+ unsigned long active;
unsigned long vaddrs[TLB_BATCH_NR];
};

extern void flush_tsb_kernel_range(unsigned long start, unsigned long end);
extern void flush_tsb_user(struct tlb_batch *tb);
+extern void flush_tsb_user_page(struct mm_struct *mm, unsigned long vaddr);

/* TLB flush operations. */

-extern void flush_tlb_pending(void);
+static inline void flush_tlb_mm(struct mm_struct *mm)
+{
+}
+
+static inline void flush_tlb_page(struct vm_area_struct *vma,
+ unsigned long vmaddr)
+{
+}
+
+static inline void flush_tlb_range(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end)
+{
+}
+
+#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE

-#define flush_tlb_range(vma,start,end) \
- do { (void)(start); flush_tlb_pending(); } while (0)
-#define flush_tlb_page(vma,addr) flush_tlb_pending()
-#define flush_tlb_mm(mm) flush_tlb_pending()
+extern void flush_tlb_pending(void);
+extern void arch_enter_lazy_mmu_mode(void);
+extern void arch_leave_lazy_mmu_mode(void);
+#define arch_flush_lazy_mmu_mode() do {} while (0)

/* Local cpu only. */
extern void __flush_tlb_all(void);
-
+extern void __flush_tlb_page(unsigned long context, unsigned long vaddr);
extern void __flush_tlb_kernel_range(unsigned long start, unsigned long end);

#ifndef CONFIG_SMP
@@ -38,15 +54,24 @@ do { flush_tsb_kernel_range(start,end);
__flush_tlb_kernel_range(start,end); \
} while (0)

+static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr)
+{
+ __flush_tlb_page(CTX_HWBITS(mm->context), vaddr);
+}
+
#else /* CONFIG_SMP */

extern void smp_flush_tlb_kernel_range(unsigned long start, unsigned long end);
+extern void smp_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr);

#define flush_tlb_kernel_range(start, end) \
do { flush_tsb_kernel_range(start,end); \
smp_flush_tlb_kernel_range(start, end); \
} while (0)

+#define global_flush_tlb_page(mm, vaddr) \
+ smp_flush_tlb_page(mm, vaddr)
+
#endif /* ! CONFIG_SMP */

#endif /* _SPARC64_TLBFLUSH_H */
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -856,7 +856,7 @@ void smp_tsb_sync(struct mm_struct *mm)
}

extern unsigned long xcall_flush_tlb_mm;
-extern unsigned long xcall_flush_tlb_pending;
+extern unsigned long xcall_flush_tlb_page;
extern unsigned long xcall_flush_tlb_kernel_range;
extern unsigned long xcall_fetch_glob_regs;
extern unsigned long xcall_receive_signal;
@@ -1070,22 +1070,55 @@ local_flush_and_out:
put_cpu();
}

+struct tlb_pending_info {
+ unsigned long ctx;
+ unsigned long nr;
+ unsigned long *vaddrs;
+};
+
+static void tlb_pending_func(void *info)
+{
+ struct tlb_pending_info *t = info;
+
+ __flush_tlb_pending(t->ctx, t->nr, t->vaddrs);
+}
+
void smp_flush_tlb_pending(struct mm_struct *mm, unsigned long nr, unsigned long *vaddrs)
{
u32 ctx = CTX_HWBITS(mm->context);
+ struct tlb_pending_info info;
int cpu = get_cpu();

+ info.ctx = ctx;
+ info.nr = nr;
+ info.vaddrs = vaddrs;
+
if (mm == current->mm && atomic_read(&mm->mm_users) == 1)
cpumask_copy(mm_cpumask(mm), cpumask_of(cpu));
else
- smp_cross_call_masked(&xcall_flush_tlb_pending,
- ctx, nr, (unsigned long) vaddrs,
- mm_cpumask(mm));
+ smp_call_function_many(mm_cpumask(mm), tlb_pending_func,
+ &info, 1);

__flush_tlb_pending(ctx, nr, vaddrs);

put_cpu();
}
+
+void smp_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr)
+{
+ unsigned long context = CTX_HWBITS(mm->context);
+ int cpu = get_cpu();
+
+ if (mm == current->mm && atomic_read(&mm->mm_users) == 1)
+ cpumask_copy(mm_cpumask(mm), cpumask_of(cpu));
+ else
+ smp_cross_call_masked(&xcall_flush_tlb_page,
+ context, vaddr, 0,
+ mm_cpumask(mm));
+ __flush_tlb_page(context, vaddr);
+
+ put_cpu();
+}

void smp_flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
--- a/arch/sparc/mm/tlb.c
+++ b/arch/sparc/mm/tlb.c
@@ -24,11 +24,17 @@ static DEFINE_PER_CPU(struct tlb_batch,
void flush_tlb_pending(void)
{
struct tlb_batch *tb = &get_cpu_var(tlb_batch);
+ struct mm_struct *mm = tb->mm;

- if (tb->tlb_nr) {
- flush_tsb_user(tb);
+ if (!tb->tlb_nr)
+ goto out;

- if (CTX_VALID(tb->mm->context)) {
+ flush_tsb_user(tb);
+
+ if (CTX_VALID(mm->context)) {
+ if (tb->tlb_nr == 1) {
+ global_flush_tlb_page(mm, tb->vaddrs[0]);
+ } else {
#ifdef CONFIG_SMP
smp_flush_tlb_pending(tb->mm, tb->tlb_nr,
&tb->vaddrs[0]);
@@ -37,12 +43,30 @@ void flush_tlb_pending(void)
tb->tlb_nr, &tb->vaddrs[0]);
#endif
}
- tb->tlb_nr = 0;
}

+ tb->tlb_nr = 0;
+
+out:
put_cpu_var(tlb_batch);
}

+void arch_enter_lazy_mmu_mode(void)
+{
+ struct tlb_batch *tb = &__get_cpu_var(tlb_batch);
+
+ tb->active = 1;
+}
+
+void arch_leave_lazy_mmu_mode(void)
+{
+ struct tlb_batch *tb = &__get_cpu_var(tlb_batch);
+
+ if (tb->tlb_nr)
+ flush_tlb_pending();
+ tb->active = 0;
+}
+
void tlb_batch_add(struct mm_struct *mm, unsigned long vaddr,
pte_t *ptep, pte_t orig, int fullmm)
{
@@ -90,6 +114,12 @@ no_cache_flush:
nr = 0;
}

+ if (!tb->active) {
+ global_flush_tlb_page(mm, vaddr);
+ flush_tsb_user_page(mm, vaddr);
+ goto out;
+ }
+
if (nr == 0)
tb->mm = mm;

@@ -98,5 +128,6 @@ no_cache_flush:
if (nr >= TLB_BATCH_NR)
flush_tlb_pending();

+out:
put_cpu_var(tlb_batch);
}
--- a/arch/sparc/mm/tsb.c
+++ b/arch/sparc/mm/tsb.c
@@ -8,11 +8,10 @@
#include <linux/slab.h>
#include <asm/system.h>
#include <asm/page.h>
-#include <asm/tlbflush.h>
-#include <asm/tlb.h>
-#include <asm/mmu_context.h>
#include <asm/pgtable.h>
+#include <asm/mmu_context.h>
#include <asm/tsb.h>
+#include <asm/tlb.h>
#include <asm/oplib.h>

extern struct tsb swapper_tsb[KERNEL_TSB_NENTRIES];
@@ -47,23 +46,27 @@ void flush_tsb_kernel_range(unsigned lon
}
}

-static void __flush_tsb_one(struct tlb_batch *tb, unsigned long hash_shift,
- unsigned long tsb, unsigned long nentries)
+static void __flush_tsb_one_entry(unsigned long tsb, unsigned long v,
+ unsigned long hash_shift,
+ unsigned long nentries)
{
- unsigned long i;
+ unsigned long tag, ent, hash;

- for (i = 0; i < tb->tlb_nr; i++) {
- unsigned long v = tb->vaddrs[i];
- unsigned long tag, ent, hash;
+ v &= ~0x1UL;
+ hash = tsb_hash(v, hash_shift, nentries);
+ ent = tsb + (hash * sizeof(struct tsb));
+ tag = (v >> 22UL);

- v &= ~0x1UL;
+ tsb_flush(ent, tag);
+}

- hash = tsb_hash(v, hash_shift, nentries);
- ent = tsb + (hash * sizeof(struct tsb));
- tag = (v >> 22UL);
+static void __flush_tsb_one(struct tlb_batch *tb, unsigned long hash_shift,
+ unsigned long tsb, unsigned long nentries)
+{
+ unsigned long i;

- tsb_flush(ent, tag);
- }
+ for (i = 0; i < tb->tlb_nr; i++)
+ __flush_tsb_one_entry(tsb, tb->vaddrs[i], hash_shift, nentries);
}

void flush_tsb_user(struct tlb_batch *tb)
@@ -89,6 +92,30 @@ void flush_tsb_user(struct tlb_batch *tb
}
#endif
spin_unlock_irqrestore(&mm->context.lock, flags);
+}
+
+void flush_tsb_user_page(struct mm_struct *mm, unsigned long vaddr)
+{
+ unsigned long nentries, base, flags;
+
+ spin_lock_irqsave(&mm->context.lock, flags);
+
+ base = (unsigned long) mm->context.tsb_block[MM_TSB_BASE].tsb;
+ nentries = mm->context.tsb_block[MM_TSB_BASE].tsb_nentries;
+ if (tlb_type == cheetah_plus || tlb_type == hypervisor)
+ base = __pa(base);
+ __flush_tsb_one_entry(base, vaddr, PAGE_SHIFT, nentries);
+
+#if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
+ if (mm->context.tsb_block[MM_TSB_HUGE].tsb) {
+ base = (unsigned long) mm->context.tsb_block[MM_TSB_HUGE].tsb;
+ nentries = mm->context.tsb_block[MM_TSB_HUGE].tsb_nentries;
+ if (tlb_type == cheetah_plus || tlb_type == hypervisor)
+ base = __pa(base);
+ __flush_tsb_one_entry(base, vaddr, HPAGE_SHIFT, nentries);
+ }
+#endif
+ spin_unlock_irqrestore(&mm->context.lock, flags);
}

#if defined(CONFIG_SPARC64_PAGE_SIZE_8KB)
--- a/arch/sparc/mm/ultra.S
+++ b/arch/sparc/mm/ultra.S
@@ -53,6 +53,33 @@ __flush_tlb_mm: /* 18 insns */
nop

.align 32
+ .globl __flush_tlb_page
+__flush_tlb_page: /* 22 insns */
+ /* %o0 = context, %o1 = vaddr */
+ rdpr %pstate, %g7
+ andn %g7, PSTATE_IE, %g2
+ wrpr %g2, %pstate
+ mov SECONDARY_CONTEXT, %o4
+ ldxa [%o4] ASI_DMMU, %g2
+ stxa %o0, [%o4] ASI_DMMU
+ andcc %o1, 1, %g0
+ andn %o1, 1, %o3
+ be,pn %icc, 1f
+ or %o3, 0x10, %o3
+ stxa %g0, [%o3] ASI_IMMU_DEMAP
+1: stxa %g0, [%o3] ASI_DMMU_DEMAP
+ membar #Sync
+ stxa %g2, [%o4] ASI_DMMU
+ sethi %hi(KERNBASE), %o4
+ flush %o4
+ retl
+ wrpr %g7, 0x0, %pstate
+ nop
+ nop
+ nop
+ nop
+
+ .align 32
.globl __flush_tlb_pending
__flush_tlb_pending: /* 26 insns */
/* %o0 = context, %o1 = nr, %o2 = vaddrs[] */
@@ -203,6 +230,31 @@ __cheetah_flush_tlb_mm: /* 19 insns */
retl
wrpr %g7, 0x0, %pstate

+__cheetah_flush_tlb_page: /* 22 insns */
+ /* %o0 = context, %o1 = vaddr */
+ rdpr %pstate, %g7
+ andn %g7, PSTATE_IE, %g2
+ wrpr %g2, 0x0, %pstate
+ wrpr %g0, 1, %tl
+ mov PRIMARY_CONTEXT, %o4
+ ldxa [%o4] ASI_DMMU, %g2
+ srlx %g2, CTX_PGSZ1_NUC_SHIFT, %o3
+ sllx %o3, CTX_PGSZ1_NUC_SHIFT, %o3
+ or %o0, %o3, %o0 /* Preserve nucleus page size fields */
+ stxa %o0, [%o4] ASI_DMMU
+ andcc %o1, 1, %g0
+ be,pn %icc, 1f
+ andn %o1, 1, %o3
+ stxa %g0, [%o3] ASI_IMMU_DEMAP
+1: stxa %g0, [%o3] ASI_DMMU_DEMAP
+ membar #Sync
+ stxa %g2, [%o4] ASI_DMMU
+ sethi %hi(KERNBASE), %o4
+ flush %o4
+ wrpr %g0, 0, %tl
+ retl
+ wrpr %g7, 0x0, %pstate
+
__cheetah_flush_tlb_pending: /* 27 insns */
/* %o0 = context, %o1 = nr, %o2 = vaddrs[] */
rdpr %pstate, %g7
@@ -269,6 +321,20 @@ __hypervisor_flush_tlb_mm: /* 10 insns *
retl
nop

+__hypervisor_flush_tlb_page: /* 11 insns */
+ /* %o0 = context, %o1 = vaddr */
+ mov %o0, %g2
+ mov %o1, %o0 /* ARG0: vaddr + IMMU-bit */
+ mov %g2, %o1 /* ARG1: mmu context */
+ mov HV_MMU_ALL, %o2 /* ARG2: flags */
+ srlx %o0, PAGE_SHIFT, %o0
+ sllx %o0, PAGE_SHIFT, %o0
+ ta HV_MMU_UNMAP_ADDR_TRAP
+ brnz,pn %o0, __hypervisor_tlb_tl0_error
+ mov HV_MMU_UNMAP_ADDR_TRAP, %o1
+ retl
+ nop
+
__hypervisor_flush_tlb_pending: /* 16 insns */
/* %o0 = context, %o1 = nr, %o2 = vaddrs[] */
sllx %o1, 3, %g1
@@ -339,6 +405,13 @@ cheetah_patch_cachetlbops:
call tlb_patch_one
mov 19, %o2

+ sethi %hi(__flush_tlb_page), %o0
+ or %o0, %lo(__flush_tlb_page), %o0
+ sethi %hi(__cheetah_flush_tlb_page), %o1
+ or %o1, %lo(__cheetah_flush_tlb_page), %o1
+ call tlb_patch_one
+ mov 22, %o2
+
sethi %hi(__flush_tlb_pending), %o0
or %o0, %lo(__flush_tlb_pending), %o0
sethi %hi(__cheetah_flush_tlb_pending), %o1
@@ -397,10 +470,9 @@ xcall_flush_tlb_mm: /* 21 insns */
nop
nop

- .globl xcall_flush_tlb_pending
-xcall_flush_tlb_pending: /* 21 insns */
- /* %g5=context, %g1=nr, %g7=vaddrs[] */
- sllx %g1, 3, %g1
+ .globl xcall_flush_tlb_page
+xcall_flush_tlb_page: /* 17 insns */
+ /* %g5=context, %g1=vaddr */
mov PRIMARY_CONTEXT, %g4
ldxa [%g4] ASI_DMMU, %g2
srlx %g2, CTX_PGSZ1_NUC_SHIFT, %g4
@@ -408,20 +480,16 @@ xcall_flush_tlb_pending: /* 21 insns */
or %g5, %g4, %g5
mov PRIMARY_CONTEXT, %g4
stxa %g5, [%g4] ASI_DMMU
-1: sub %g1, (1 << 3), %g1
- ldx [%g7 + %g1], %g5
- andcc %g5, 0x1, %g0
+ andcc %g1, 0x1, %g0
be,pn %icc, 2f
-
- andn %g5, 0x1, %g5
+ andn %g1, 0x1, %g5
stxa %g0, [%g5] ASI_IMMU_DEMAP
2: stxa %g0, [%g5] ASI_DMMU_DEMAP
membar #Sync
- brnz,pt %g1, 1b
- nop
stxa %g2, [%g4] ASI_DMMU
retry
nop
+ nop

.globl xcall_flush_tlb_kernel_range
xcall_flush_tlb_kernel_range: /* 25 insns */
@@ -596,15 +664,13 @@ __hypervisor_xcall_flush_tlb_mm: /* 21 i
membar #Sync
retry

- .globl __hypervisor_xcall_flush_tlb_pending
-__hypervisor_xcall_flush_tlb_pending: /* 21 insns */
- /* %g5=ctx, %g1=nr, %g7=vaddrs[], %g2,%g3,%g4,g6=scratch */
- sllx %g1, 3, %g1
+ .globl __hypervisor_xcall_flush_tlb_page
+__hypervisor_xcall_flush_tlb_page: /* 17 insns */
+ /* %g5=ctx, %g1=vaddr */
mov %o0, %g2
mov %o1, %g3
mov %o2, %g4
-1: sub %g1, (1 << 3), %g1
- ldx [%g7 + %g1], %o0 /* ARG0: virtual address */
+ mov %g1, %o0 /* ARG0: virtual address */
mov %g5, %o1 /* ARG1: mmu context */
mov HV_MMU_ALL, %o2 /* ARG2: flags */
srlx %o0, PAGE_SHIFT, %o0
@@ -613,8 +679,6 @@ __hypervisor_xcall_flush_tlb_pending: /*
mov HV_MMU_UNMAP_ADDR_TRAP, %g6
brnz,a,pn %o0, __hypervisor_tlb_xcall_error
mov %o0, %g5
- brnz,pt %g1, 1b
- nop
mov %g2, %o0
mov %g3, %o1
mov %g4, %o2
@@ -697,6 +761,13 @@ hypervisor_patch_cachetlbops:
call tlb_patch_one
mov 10, %o2

+ sethi %hi(__flush_tlb_page), %o0
+ or %o0, %lo(__flush_tlb_page), %o0
+ sethi %hi(__hypervisor_flush_tlb_page), %o1
+ or %o1, %lo(__hypervisor_flush_tlb_page), %o1
+ call tlb_patch_one
+ mov 11, %o2
+
sethi %hi(__flush_tlb_pending), %o0
or %o0, %lo(__flush_tlb_pending), %o0
sethi %hi(__hypervisor_flush_tlb_pending), %o1
@@ -728,12 +799,12 @@ hypervisor_patch_cachetlbops:
call tlb_patch_one
mov 21, %o2

- sethi %hi(xcall_flush_tlb_pending), %o0
- or %o0, %lo(xcall_flush_tlb_pending), %o0
- sethi %hi(__hypervisor_xcall_flush_tlb_pending), %o1
- or %o1, %lo(__hypervisor_xcall_flush_tlb_pending), %o1
+ sethi %hi(xcall_flush_tlb_page), %o0
+ or %o0, %lo(xcall_flush_tlb_page), %o0
+ sethi %hi(__hypervisor_xcall_flush_tlb_page), %o1
+ or %o1, %lo(__hypervisor_xcall_flush_tlb_page), %o1
call tlb_patch_one
- mov 21, %o2
+ mov 17, %o2

sethi %hi(xcall_flush_tlb_kernel_range), %o0
or %o0, %lo(xcall_flush_tlb_kernel_range), %o0

2013-04-29 19:12:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 02/30] TTY: fix atime/mtime regression

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jiri Slaby <[email protected]>

commit 37b7f3c76595e23257f61bd80b223de8658617ee upstream.

In commit b0de59b5733d ("TTY: do not update atime/mtime on read/write")
we removed timestamps from tty inodes to fix a security issue and waited
if something breaks. Well, 'w', the utility to find out logged users
and their inactivity time broke. It shows that users are inactive since
the time they logged in.

To revert to the old behaviour while still preventing attackers to
guess the password length, we update the timestamps in one-minute
intervals by this patch.

Signed-off-by: Jiri Slaby <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/tty/tty_io.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)

--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -939,6 +939,14 @@ void start_tty(struct tty_struct *tty)

EXPORT_SYMBOL(start_tty);

+static void tty_update_time(struct timespec *time)
+{
+ unsigned long sec = get_seconds();
+ sec -= sec % 60;
+ if ((long)(sec - time->tv_sec) > 0)
+ time->tv_sec = sec;
+}
+
/**
* tty_read - read method for tty device files
* @file: pointer to tty file
@@ -976,6 +984,9 @@ static ssize_t tty_read(struct file *fil
i = -EIO;
tty_ldisc_deref(ld);

+ if (i > 0)
+ tty_update_time(&inode->i_atime);
+
return i;
}

@@ -1076,8 +1087,11 @@ static inline ssize_t do_tty_write(
break;
cond_resched();
}
- if (written)
+ if (written) {
+ struct inode *inode = file->f_path.dentry->d_inode;
+ tty_update_time(&inode->i_mtime);
ret = written;
+ }
out:
tty_write_unlock(tty);
return ret;

2013-04-29 19:12:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [ 01/30] TTY: do not update atime/mtime on read/write

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jiri Slaby <[email protected]>

commit b0de59b5733d18b0d1974a060860a8b5c1b36a2e upstream.

On http://vladz.devzero.fr/013_ptmx-timing.php, we can see how to find
out length of a password using timestamps of /dev/ptmx. It is
documented in "Timing Analysis of Keystrokes and Timing Attacks on
SSH". To avoid that problem, do not update time when reading
from/writing to a TTY.

I am afraid of regressions as this is a behavior we have since 0.97
and apps may expect the time to be current, e.g. for monitoring
whether there was a change on the TTY. Now, there is no change. So
this would better have a lot of testing before it goes upstream.

References: CVE-2013-0160

Signed-off-by: Jiri Slaby <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/tty/tty_io.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)

--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -975,8 +975,7 @@ static ssize_t tty_read(struct file *fil
else
i = -EIO;
tty_ldisc_deref(ld);
- if (i > 0)
- inode->i_atime = current_fs_time(inode->i_sb);
+
return i;
}

@@ -1077,11 +1076,8 @@ static inline ssize_t do_tty_write(
break;
cond_resched();
}
- if (written) {
- struct inode *inode = file->f_path.dentry->d_inode;
- inode->i_mtime = current_fs_time(inode->i_sb);
+ if (written)
ret = written;
- }
out:
tty_write_unlock(tty);
return ret;

2013-04-30 01:53:26

by Shuah Khan

[permalink] [raw]
Subject: Re: [ 00/30] 3.0.76-stable review

On Mon, 2013-04-29 at 12:02 -0700, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.0.76 release.
> There are 30 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed May 1 18:42:50 UTC 2013.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.76-rc1.gz
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Patches applied cleanly to 3.0.75, 3.4.42, and 3.8.10

Compiled and booted on the following systems:

Samsung Series 9 Intel Corei5 (3.4.43-rc1 and 3.8.11-rc1)
HP ProBook 6475b AMD A10-4600M APU with Radeon(tm) HD Graphics

dmesgs for all releases look good. No regressions compared to the
previous dmesgs for each of these releases.

Cross-compile tests results:

alpha: defconfig passed on all
arm: defconfig passed on all
arm64: not applicable to 3.0.y, 3.4.y. defconfig passed on 3.8.y
c6x: not applicable to 3.0.y, defconfig passed on 3.4.y, and 3.8.y.
mips: defconfig passed on all
mipsel: defconfig passed on all
powerpc: wii_defconfig passed on all
sh: defconfig passed on all
sparc: defconfig passed on all
tile: tilegx_defconfig passed on all

-- Shuah

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?