To mitigate future end-of-data sequence attacks, like SMTP Smuggling, MTAs
should comply with RFC 5321 section 4.1.1.4 [0] to strip control
characters other than <SP>, <HT>, <CR>, and <LF> in the DATA section of
SMTP messages.
> 4.1.1.4. DATA (DATA)
>
> The receiver normally sends a 354 response to DATA, and then treats
> the lines (strings ending in <CRLF> sequences, as described in
> Section 2.3.7) following the command as mail data from the sender.
> This command causes the mail data to be appended to the mail data
> buffer. The mail data may contain any of the 128 ASCII character
> codes, although experience has indicated that use of control
> characters other than SP, HT, CR, and LF may cause problems and
> SHOULD be avoided when possible.
For example, `\r\n\x00.\r\n` _SHOULD_ become `\r\n.\r\n`, after which the
_forbidden_ sequence is dot-stuffed (as per RFC 5321 section 4.5.2 [1]).
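A minimal sketch of that two-step handling on the sending side, assuming
the whole message body is available as bytes. The function names are
hypothetical and not taken from any MTA; bytes above 0x7F are passed
through untouched.

```python
# Control characters the proposal would keep (SP is 0x20 and passes the range check).
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}  # HT, LF, CR


def strip_controls(data: bytes) -> bytes:
    """Drop ASCII control characters (0x00-0x1F, 0x7F) except HT, CR, LF."""
    return bytes(
        b for b in data
        if (b >= 0x20 and b != 0x7F) or b in ALLOWED_CONTROLS
    )


def dot_stuff(data: bytes) -> bytes:
    """Client-side transparency: add one period to lines starting with a period."""
    lines = data.split(b"\r\n")
    return b"\r\n".join(b"." + ln if ln.startswith(b".") else ln for ln in lines)


def prepare_for_data(body: bytes) -> bytes:
    # Order matters: stripping first exposes the bare "." line, which
    # dot-stuffing then neutralizes.
    return dot_stuff(strip_controls(body))


print(prepare_for_data(b"hello\r\n\x00.\r\nworld\r\n"))
# b'hello\r\n..\r\nworld\r\n'
```

If the two steps ran in the opposite order, the NUL would hide the dot
from the dot-stuffing pass and stripping would then create a bare
`\r\n.\r\n`.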
As per RFC 2119 section 3 [2], the word *SHOULD* implies *MUST* unless you
have a valid reason not to--which is never the case for these _forbidden_
sequences in DATA. This is why RFC 5321 4.1.1.4's _SHOULD avoid_ implies
_needs to strip_.
Also note that RFC 5321 section 3.6.3 [3] and section 6.4 [4] do not give
the OK to send along NUL or other control characters. These sections are
about _adding_ missing information, not preserving messages with
potentially damaging garbage.
Cheers to Pete Resnick for this clarification and explanation of RFC 5321.
This particular issue was first noted in SEC Consult's analysis of SMTP
Smuggling [5]:
> During the research we've also discovered some exotic inbound SMTP
> servers that interpret end-of-data sequences like
> <CR><LF>\x00.<CR><LF>, with "\x00" representing a null byte. With
> proprietary SMTP components and lots of different e-mail services
> intertwined it's hard to tell what is possible until an e-mail reaches
> its final destination.
>
> Even though SMTP smuggling might still be hiding in some places, we
> hopefully eliminated some big targets.
Stripping NUL and other control characters could have unforeseen
consequences. MTAs which errantly rely on non-compliant control characters
would break. Major MTAs are therefore sensibly resistant to enforcing RFC
5321 section 4.1.1.4.
What is the real-world HAM:SPAM ratio of emails which include NUL? Would
it be safe to configure sendmail with `O RejectNUL=True` (which would break
RFC 2822 section 4 [6] by rejecting emails which include NUL)?
What are the benefits and risks of stripping ASCII NUL and other control
characters from SMTP DATA?
Feedback appreciated,
Mark Esler and Bastien Roucariès
[0] https://datatracker.ietf.org/doc/html/rfc5321#section-4.1.1.4
[1] https://datatracker.ietf.org/doc/html/rfc5321#section-4.5.2
[2] https://datatracker.ietf.org/doc/html/rfc2119#section-3
[3] https://datatracker.ietf.org/doc/html/rfc5321#section-3.6.3
[4] https://datatracker.ietf.org/doc/html/rfc5321#section-6.4
[5] https://sec-consult.com/blog/detail/smtp-smuggling-spoofing-e-mails-worldwide/
[6] https://datatracker.ietf.org/doc/html/rfc2822#section-4
On Mon, Apr 29, 2024 at 08:19:52PM GMT, Mark Esler wrote:
> To mitigate future end-of-data sequence attacks, like SMTP
> Smuggling, MTAs should comply with RFC 5321 section 4.1.1.4 [0] to
> strip control characters other than <SP>, <HT>, <CR>, and <LF> in
> the DATA section of SMTP messages.
[...]
> As per RFC 2119 section 3 [2], the word *SHOULD* implies *MUST*
> unless you have a valid reason not to--which is never the case for
> these _forbidden_ sequences in DATA. This is why RFC 5321 4.1.1.4's
> _SHOULD avoid_ implies _needs to strip_.
I don't see that stripping specifically is implied.
> What are the benefits and risks of stripping ASCII NUL and other
> control characters from SMTP DATA?
What is the benefit of stripping versus the much more natural option
of rejecting such messages?
One possible consequence of passing messages along in an altered form
is that various signatures may break.
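A small sketch of why that happens, assuming a signature that covers a
hash of the body bytes. The plain SHA-256 below is a simplified stand-in
and not the canonicalization of any real signing scheme; the message
content is made up.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """SHA-256 over the raw body; a simplified stand-in for a signed body hash."""
    return hashlib.sha256(body).hexdigest()


original = b"Report attached.\r\n\x00.\r\nRegards\r\n"
stripped = original.replace(b"\x00", b"")  # what a stripping relay would forward

signed_digest = body_digest(original)  # computed by the signer
relay_digest = body_digest(stripped)   # recomputed by the verifier
print(signed_digest == relay_digest)   # False
```

A rejected message never reaches a verifier, so this mismatch cannot
occur with the reject option.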
--
Ian
Hi Mark,
On Mon, Apr 29, 2024 at 08:19:52PM -0500, Mark Esler wrote:
>
> To mitigate future end-of-data sequence attacks, like SMTP Smuggling,
> MTAs should comply with RFC 5321 section 4.1.1.4 [0] to strip control
> characters other than <SP>, <HT>, <CR>, and <LF> in the DATA section
> of SMTP messages.
This is an interesting interpretation of RFC 5321, but I do not think
it follows the contents of said RFC.
> > 4.1.1.4. DATA (DATA)
> >
> > The receiver normally sends a 354 response to DATA, and then treats
> > the lines (strings ending in <CRLF> sequences, as described in
> > Section 2.3.7) following the command as mail data from the sender.
> > This command causes the mail data to be appended to the mail
> > data buffer. The mail data may contain any of the 128 ASCII
> > character codes, although experience has indicated that use
> > of control characters other than SP, HT, CR, and LF may cause
> > problems and SHOULD be avoided when possible.
>
> e.g., `\r\n\x00.\r\n` _SHOULD_ become `\r\n.\r\n` and then (as per
> RFC 5321 section 4.5.2 [1]) dot-stuff the _forbidden_ sequences.
Well, my reading of the RFC does not forbid this sequence. RFC 5321
clearly does not require transforming this sequence into another sequence.
> As per RFC 2119 section 3 [2], the word *SHOULD* implies *MUST*
> unless you have a valid reason not to--which is never the case for
> these _forbidden_ sequences in DATA. This is why RFC 5321 4.1.1.4's
> _SHOULD avoid_ implies _needs to strip_.
RFC 5321 section 4.1.1.4 (DATA (DATA)) states:
"The mail data may contain any of the 128 ASCII character codes"
RFC 5321 section 4.5.2 (Transparency) states:
"The mail data may contain any of the 128 ASCII characters."
One might think that there is some inconsistency with the "SHOULD"
in section 4.1.1.4.
One could also understand the text as allowing any ASCII character
(including NUL), but advising against the use of known problematic ones
(e.g., NUL) by cautious systems.
To put this differently: control characters are _not_ forbidden.
They are _explicitly_ allowed.
> Also note that RFC 5321 section 3.6.3 [3] and section 6.4 [4] do not give
> the OK to send along NUL or other control characters. These sections are
> about _adding_ missing information, not preserving messages with
> potentially damaging garbage.
RFC 5321 section 3.6.3 does not pertain to DATA contents. RFC
5321 section 6.4 mentions the problem of inconsistent handling of
"irregularities", i.e., shall malformed messages be rejected, "repaired",
or delivered as-is insofar as possible.
You seem to advocate for "repair". The "repair" strategy makes Cisco's
ESA vulnerable. I would argue that rejecting messages is less insecure.
> Cheers to Pete Resnick for this clarification and explanation of
> RFC 5321.
>
> This particular issue was first noted in SEC Consult's analysis of
> SMTP Smuggling [5]:
> > During the research we've also discovered some exotic
> > inbound SMTP servers that interpret end-of-data sequences like
> > <CR><LF>\x00.<CR><LF>, with "\x00" representing a null byte. With
> > proprietary SMTP components and lots of different e-mail services
> > intertwined it's hard to tell what is possible until an e-mail
> > reaches its final destination.
> >
> > Even though SMTP smuggling might still be hiding in some places,
> > we hopefully eliminated some big targets.
>
> Stripping NUL and other control characters could have unforeseen
> consequences. MTAs which errantly rely on non-compliant control
> characters would break. Major MTAs are therefore sensibly resistant
> to enforcing RFC 5321 section 4.1.1.4.
Use of control characters is compliant, even though it may be problematic.
> What is the real world HAM:SPAM ratio of emails which include NUL? Would
> it be safe to configure sendmail to `O RejectNUL=True` (which would
> break RFC 2822 section 4 [6] by rejecting email which include NUL)?
>
> What are the benefits and risks of stripping ASCII NUL and other
> control characters from SMTP DATA?
Interesting questions. Perhaps you could perform an experiment and
report on the results?
Perhaps email specifications can be improved to reject known problematic
content elements, e.g., NUL bytes?
Rejecting email for arbitrary reasons is common practice currently.
Rejecting email for containing unwanted characters or character sequences
might thus be acceptable. Rewriting email contents seems to me to be
more problematic, but even that is routinely done nowadays (e.g., to
mark external messages).
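As a rough sketch of the "reject" strategy, assuming the server has
collected the DATA bytes before replying. The function names, the choice
of reply code 554, and the reply text are illustrative assumptions and
not taken from any existing MTA.

```python
from typing import Optional

# Control characters that the RFC 5321 section 4.1.1.4 advice leaves alone.
_SAFE_CONTROLS = frozenset(b"\t\r\n")


def find_unwanted_control(data: bytes) -> Optional[int]:
    """Return the offset of the first control character other than HT, CR, LF."""
    for offset, byte in enumerate(data):
        if (byte < 0x20 or byte == 0x7F) and byte not in _SAFE_CONTROLS:
            return offset
    return None


def data_reply(data: bytes) -> str:
    """Choose the reply sent after end-of-data; the message is never altered."""
    offset = find_unwanted_control(data)
    if offset is None:
        return "250 OK"
    return (
        f"554 Message rejected: control character "
        f"0x{data[offset]:02x} at offset {offset}"
    )


print(data_reply(b"Subject: hi\r\n\r\nbody\r\n"))
print(data_reply(b"body\r\n\x00.\r\n"))
```

Because nothing is rewritten, the sender learns about the problem and
any accepted message is delivered byte for byte.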
> Feedback appreciated,
I would suggest to be rather careful when automatically rewriting
messages in new and unsuspected ways.
> Mark Esler and Bastien Roucariès
>
> [0] https://datatracker.ietf.org/doc/html/rfc5321#section-4.1.1.4
> [1] https://datatracker.ietf.org/doc/html/rfc5321#section-4.5.2
> [2] https://datatracker.ietf.org/doc/html/rfc2119#section-3
> [3] https://datatracker.ietf.org/doc/html/rfc5321#section-3.6.3
> [4] https://datatracker.ietf.org/doc/html/rfc5321#section-6.4
> [5] https://sec-consult.com/blog/detail/smtp-smuggling-spoofing-e-mails-worldwide/
> [6] https://datatracker.ietf.org/doc/html/rfc2822#section-4
Best regards,
Erik
Mark Esler wrote in
<ZjBHOEHylGAaIo57@moon>:
|To mitigate future end-of-data sequence attacks, like SMTP Smuggling, MTAs
|should comply with RFC 5321 section 4.1.1.4 [0] to strip control
|characters other than <SP>, <HT>, <CR>, and <LF> in the DATA section of
|SMTP messages.
Given that RFC 733 is from 1977 and RFC 822 is from 1982 i feel
this entire thread is exaggerating.
The smuggling problem solely was rooted in the LF / CRLF "wars"
from at minimum the early 70s (Unix and more), with terminal
drivers doing auto-translation on-the-fly etc etc etc.
The internet history list may be worthwhile for this, or examining
the history of Unix programs. Ie, in January i also (funny)
talked to John Klensin on an IETF list saying
[.]The CR/LF "problem" seems to have been "addressed" in
UNIX as early as 1972, ie "6/12/72 STTY (II)" gives
020 map CR into LF; echo LF or CR as LF-CR
...
Mode 020 causes input carriage returns to be turned into new-lines;
input of either CR or LF causes LF-CR both to be echoed
(used for GE TermiNet 300's and other terminals without the
newline function).
In 1974 it became
-nl allow carriage return for new-line,
and output CR-LF for carriage return or new-line
nl accept only new-line to end lines
Which makes me *think* that "Houston, we have a problem" was
ACKnowledged, and in order not to be a crook something would have
been done about it, saving even a byte per line. But i do not
know, this was all military and other high sphere academics by
then. Interesting, by the way, that "so many" expensive decisions
were deemed necessary[.]
--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
Please let me elaborate a little more on this, not to be
misunderstood and also..
Steffen Nurpmeso wrote in
<20240430224823.uA8Nr1Cp@steffen%sdaoden.eu>:
|Mark Esler wrote in
| <ZjBHOEHylGAaIo57@moon>:
||To mitigate future end-of-data sequence attacks, like SMTP Smuggling, MTAs
||should comply with RFC 5321 section 4.1.1.4 [0] to strip control
||characters other than <SP>, <HT>, <CR>, and <LF> in the DATA section of
||SMTP messages.
|
|Given that RFC 733 is from 1977 and RFC 822 is from 1982 i feel
|this entire thread is exaggerating.
|
|The smuggling problem solely was rooted in the LF / CRLF "wars"
|from at minimum the early 70s (Unix and more), with terminal
|drivers doing auto-translation on-the-fly etc etc etc.
..
|[.]Ie, in January i also (funny)
|talked to John Klensin on an IETF list saying
|
| [.]The CR/LF "problem" seems to have been "addressed" in
| UNIX as early as 1972, ie "6/12/72 STTY (II)" gives
...
| In 1974 it became
...
| -nl allow carriage return for new-line,
| and output CR-LF for carriage return or new-line
| nl accept only new-line to end lines
...
..because two drafts on character set cramping circulate in the
IETF (of which i am not a representative (member), just like i do
not use airplanes in my adult life, eat meat etcetc). For one
there is draft-bormann-dispatch-modern-network-unicode, and there
is another one by the writer of the (uh! horror!) JSON RFC 8259.
I myself oppose any such cramping in general, and do not
understand their usefulness. I said, yes, if you cat(1) such
a file to a UNIX terminal [..you can think the rest..]. (There is
btw "cat -vet", which i do not deem harmful in that sense.) In
general Unicode also has the "SYMBOL FOR [ascii-control]" range to
visualize controls, at U+2400 ff.; it seems this is not widely known.
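A short sketch of that visualization, assuming input bytes and mapping
each ASCII control code to its Control Pictures code point (U+2400 plus
the code, with U+2421 for DEL). The function name is hypothetical, and
bytes above 0x7F are shown as Latin-1 purely for illustration.

```python
def visualize_controls(data: bytes) -> str:
    """Replace ASCII control codes with Unicode Control Pictures."""
    out = []
    for byte in data:
        if byte < 0x20:
            out.append(chr(0x2400 + byte))  # U+2400 is SYMBOL FOR NULL
        elif byte == 0x7F:
            out.append("\u2421")            # SYMBOL FOR DELETE
        else:
            out.append(chr(byte))
    return "".join(out)


print(visualize_controls(b"\r\n\x00.\r\n"))
```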
To make it short, if you have some protocol or media type
somewhere, you define its semantics, and those can be whatever is
needed or desired. For example if you mutilate somewhat user
friendly SGML to XML, you can restrict the available code points:
they did, and so you cannot represent the entire possible range of
filenames of either POSIX or Windows with XML.
But, i said, a *general* restriction of the range of code points,
what should this be good for. For example SMTP can transport all
these for many decades, and it works a billion times each day;
today mostly MIME content-transfer-encoding is applied though, and
Unicode aware applications either show those "this-font-has-that-
codepoint-NOT" boxes (no normal user gets that btw), or the symbol
as such, *even for control codes*, as above, for example U+2400
for NUL (Unicode NULL).
Yes, i mean, a program must deal with it, the one way or the
other. Has someone tested how many XML programs for example
adhere to the standardized range of allowed code points?
Bugs and logical errors also exist everywhere, the rust of it.
Etc etc. A control character is nothing special.
See, with JSON, for example, you get surrogates encoded as UTF-8
which is a totally forbidden thing according to Unicode, and any
poor conforming UTF-8 decoder has to deal with that mess if really
JSON has to be used for K=V\0K=V\0\0 value lists.
Anyway, there was a pretty good discussion in October last year
(on art AT IETF), and to my surprise again in March, somehow not
taking into account many items raised in last October.
And i very much liked the actual IETF session for which dozens of
people flew hundreds of thousands of kilometres to Brisbane in
Australia, and Rob Sayre of i think Mozilla thankfully posted an
URL to the correct timeline position
https://youtu.be/bPNRO2HYITg?si=zyWwK26TqYel7mRT&t=6684
I very much agree with all the people in the crowd ("define
a profile!"), and i loved the one who said something like
"presentation is fantastic, conclusions are not".
No people, that not.
While (presumably) here, i also dislike that demonization of SUID
programs currently on the table again. Is it really so much safer
and better to have some program talk to an all-capable
super-daemon via IPC, which then starts another program with the
desired "environment", instead of having a program like "super"
(which somehow disappeared somewhen after Y2K?), "sudo" or "doas"
(what i now use because my needs are very small minded, and i use
scripts which do the real work, for example
$ sed '/^$/d;/^#/d' < /etc/doas.conf | wc -l
14
..
permit nopass nolog keepenv setenv { -SSH_AGENT_PID -SSH_AUTH_SOCK } :shared as root cmd /x/pub/box-web.sh
->
[ $(id -u) -ne 0 ] && exec ${SUPER} /x/pub/box-web.sh "${@}"
runit "${@}"
->
boxit ${action} </dev/null >/dev/null 2>&1 &
->
cd /
ip netns exec ${netns} \
/usr/bin/env -i AUTHDISPLAY=${AUTHDISPLAY} DISPLAY=${DISPLAY} TERM=${TERM} XAUTHORITY=${XAUTHORITY} \
/usr/bin/unshare --ipc --uts --pid --fork --mount --mount-proc ${kill_child} ${rooter} ${prog} &
pid=${!}
[ -d /sys/fs/cgroup/_box_web ] && printf '%s\n' ${pid} > /sys/fs/cgroup/_box_web/cgroup.procs
A bit racy, unfortunately. I mean, if you *design* a SUID
program, and it sets up things (clearing environment, closing
FDs..), and then loses privileges, how is that worse? Where is
the attack surface? Parsing the configuration file maybe, and you
*could* outsource that into a dedicated subprocess and talk with
that via IPC.
I need ping sometimes, i need video and audio access, and "my"
user is in the necessary groups; whether some super-server allows
some other non-SUID program to access those via some configuration
file somewhere, or whether normal (searchable) UNIX
user/group[/capability] credentials are used to control access to
a carefully designed and audited SET[GU]ID binary that creates
a ping socket / opens an audio device and ioctl-inits it etc etc,
before it once and for all drops those privileges.
I think it is unfair to compare programs which have decades of
history, which were developed in a software world where maybe
OpenSSL already existed, like surely malicious actors, too, but in
a completely different mental set without the experience of mass
surveillance and mass exploits etc etc. If you want to demonize,
demonize that, not programs like sudo; not to mention that in
order to support the configurability friendliness of sudo, any
other implementation had to go a long road.
--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
Steffen,
On Thu, May 02, 2024 at 08:21:46PM +0200, Steffen Nurpmeso wrote:
> Please let me elaborate a little more on this, not to be
> misunderstood and also..
This reads like an excuse to post lots of thoughts that are off-topic
for this thread. I understand that sometimes discussions wander off the
original topic, but in this case the second half of your message is
totally irrelevant. I approved this one message anyway out of respect
for the time you spent writing it, but please be aware that I am
unlikely to do that next time you do something like this. I also ask
others to please refrain from following up on parts irrelevant to this
thread's original topic. Another reason I post this reply publicly is
not to give the impression that it's OK to hijack threads here.
Alexander
On 4/30/24 14:13, Ian wrote:
> What is the benefit of stripping versus the much more natural option
> of rejecting such messages?
This applies to <CRLF>.<CRLF> and RFC 5321 section 4.5.2 as well.
Postfix' CVE-2023-51764 patchset adds options to normalize (default),
note, reject, or ignore bare newlines:
```
cleanup_replace_stray_cr_lf = default:yes
smtpd_forbid_bare_newline = default:normalize
smtpd_forbid_bare_newline_exclusions = default:$mynetworks
```
(See Postfix' HISTORY file for more context.)
To get back to your question, these all have use cases. What should be the
default is described in RFC 5321. Offering options and using the RFC's sane
default seems like a fair balance.
On 4/30/24 14:42, Erik Auerswald wrote:
> Well, my reading of the RFC does not forbid this sequence. RFC 5321
> clearly does not require transforming this sequence into another sequence.
I should have initially clarified that _"forbidden" pattern_ refers to RFC
5321 section 4.5.2:
Without some provision for data transparency, the character sequence
"<CRLF>.<CRLF>" ends the mail text and cannot be sent by the user.
In general, users are not aware of such "forbidden" sequences. To
allow all user composed text to be transmitted transparently, the
following procedures are used:
o Before sending a line of mail text, the SMTP client checks the
first character of the line. If it is a period, one additional
period is inserted at the beginning of the line.
o When a line of mail text is received by the SMTP server, it checks
the line. If the line is composed of a single period, it is
treated as the end of mail indicator. If the first character is a
period and there are other characters on the line, the first
character is deleted.
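The server-side procedure quoted above can be sketched as follows,
assuming a strict receiver that accepts only <CRLF> as the line
terminator and has the DATA stream buffered as bytes. The function name
is hypothetical.

```python
from typing import Tuple


def receive_data(stream: bytes) -> Tuple[bytes, bytes]:
    """Split at the first line consisting of a single period and undo
    dot-stuffing. Returns (message, bytes left over after end-of-data)."""
    message = []
    rest = stream
    while rest:
        line, sep, rest = rest.partition(b"\r\n")
        if not sep:
            break                 # incomplete final line
        if line == b".":
            return b"".join(message), rest
        if line.startswith(b"."):
            line = line[1:]       # delete the stuffed period
        message.append(line + b"\r\n")
    raise ValueError("stream ended before end-of-data")


msg, leftover = receive_data(b"hello\r\n..\r\nworld\r\n.\r\nQUIT\r\n")
print(msg)       # b'hello\r\n.\r\nworld\r\n'
print(leftover)  # b'QUIT\r\n'
```

A receiver written this way treats a `\x00.` line as ordinary mail data,
since the line is not composed of a single period. The disagreement only
arises when some hop interprets or rewrites that line differently.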
I believe "<CRLF><NUL>.<CRLF>" is a forbidden sequence. Resnick's response
to this sequence is:
"Yes, the sending MTA should strip the NUL (as per RFC 5321 section
4.1.1.4) and then dot-stuff the dot on the line by itself (as per RFC 5321
section 4.5.2)."
> You seem to advocate for "repair". The "repair" strategy makes Cisco's
> ESA vulnerable. I would argue that rejecting messages is less insecure.
By default, Cisco ESA is not RFC compliant. Their "clean" option only
replaces bare <CR> and <LF> characters with <CRLF>. So that <CR>.<CR>
becomes <CRLF>.<CRLF>. Both of these are "forbidden" patterns and should
be dot stuffed.
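A minimal sketch of that effect, assuming a normalizer that rewrites
every bare <CR> or <LF> as <CRLF> and does nothing else. This is an
illustration of the described behavior, not Cisco's implementation, and
the address is a placeholder.

```python
import re


def normalize_newlines(data: bytes) -> bytes:
    """Rewrite bare CR and bare LF as CRLF, leaving existing CRLF pairs alone."""
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


smuggled = b"first message\r.\rMAIL FROM:<attacker@example.com>\r"
print(b"\r\n.\r\n" in smuggled)                      # False
print(b"\r\n.\r\n" in normalize_newlines(smuggled))  # True
```

The input contains no end-of-data sequence, but the output does, which
is why the normalized line would also need to be dot-stuffed before
being relayed.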
On 4/30/24 17:48, Steffen Nurpmeso wrote:
> Given that RFC 733 is from 1977 and RFC 822 is from 1982 i feel
> this entire thread is exaggerating.
The history of EOD attacks makes the recent SMTP Smuggling attacks so
surprising! It is hard to believe that SMTP servers were recently vulnerable
to <CRLF>.<CRLF> variations and that others still are.
Thanks Solar :)
Mark Esler and Bastien Roucariès
Hi,
On Wed, May 08, 2024 at 10:01:16AM -0500, Mark Esler wrote:
> [...]
> On 4/30/24 14:42, Erik Auerswald wrote:
> >
> > Well, my reading of the RFC does not forbid this sequence. RFC 5321
> > clearly does not require transforming this sequence into another
> > sequence.
>
> I should have initially clarified that _"forbidden" pattern_ refers
> to RFC 5321 section 4.5.2:
>
> Without some provision for data transparency, the character sequence
> "<CRLF>.<CRLF>" ends the mail text and cannot be sent by the user.
> In general, users are not aware of such "forbidden" sequences. To
> allow all user composed text to be transmitted transparently, the
> following procedures are used:
>
> o Before sending a line of mail text, the SMTP client checks the
> first character of the line. If it is a period, one additional
> period is inserted at the beginning of the line.
>
> o When a line of mail text is received by the SMTP server, it checks
> the line. If the line is composed of a single period, it is
> treated as the end of mail indicator. If the first character is a
> period and there are other characters on the line, the first
> character is deleted.
>
> I believe "<CRLF><NUL>.<CRLF>" is a forbidden sequence. Resnick's
> response to this sequence is:
> "Yes, the sending MTA should strip the NUL (as per RFC 5321 section
> 4.1.1.4) and then dot-stuff the dot on the line by itself (as per
> RFC 5321 section 4.5.2)."
This section of the RFC explicitly states that any ASCII character is
allowed (see the first sentence you omitted from your quote). Any ASCII
character includes NUL. Stripping the NUL violates the standard.
This is obvious. The RFC text is clear.
> > You seem to advocate for "repair". The "repair" strategy makes
> > Cisco's ESA vulnerable. I would argue that rejecting messages is
> > less insecure.
>
> By default, Cisco ESA is not RFC compliant.
The Cisco ESA has been called out in the original SMTP smuggling report
as facilitating SMTP smuggling attacks, thus it is useful as an example.
It provides an example where a side-effect of rewriting email content is
vulnerability to "SMTP smuggling". The important part is that rewriting
email content might have unintended side-effects, independent of RFC
compliance.
> Their "clean" option only replaces bare <CR> and <LF> characters with
> <CRLF>. So that <CR>.<CR> becomes <CRLF>.<CRLF>. Both of these are
> "forbidden" patterns and should be dot stuffed.
The one and only "forbidden" sequence is "<CRLF>.<CRLF>".
The one and only sequence to be "dot stuffed" is "<CRLF>.<CRLF>".
I think that "forbidden" is a bad choice of words. It does not mean that
some authority has "forbidden" sending this sequence inside an email,
but that SMTP needs a way to transparently transport the "forbidden"
sequence for the user who is explicitly *allowed* to send it:
"To allow all user composed text to be transmitted transparently"
"The mail data may contain any of the 128 ASCII characters."
Best regards,
Erik