FYI: the filesystem hosting kanga.kvack.org which hosts [email protected]
and a few other assorted things was damaged around 9:17am. The mailing
list is back up and running from a March 2nd backup for now. The problem
is either a bad SSD or a btrfs bug, however no crash dump was captured to
help with debugging. New hardware will be deployed this afternoon after
an attempt at data recovery is made.
-ben
--
"Thought is the essence of where you are now."
On Tue, Jun 22, 2021 at 10:59:54AM -0400, Benjamin LaHaise wrote:
> FYI: the filesystem hosting kanga.kvack.org which hosts [email protected]
> and a few other assorted things was damaged around 9:17am. The mailing
> list is back up and running from a March 2nd backup for now. The problem
> is either a bad SSD or a btrfs bug, however no crash dump was captured to
> help with debugging.
Do you have any logs? Also when you have a suspicion that it was caused
by hardware, what's the SSD type? If you'd like a more interactive
discussion please come to the libera.chat #btrfs channel, there are
people with a lot of experience and knowledge about buggy hardware. In
case it's a btrfs bug we'd be interested to know at least something that
could help to narrow it down.
d.
On Tue, 2021-06-22 at 10:59 -0400, Benjamin LaHaise wrote:
> FYI: the filesystem hosting kanga.kvack.org which hosts
> [email protected]
> and a few other assorted things was damaged around 9:17am. The
> mailing list is back up and running from a March 2nd backup for
> now. The problem is either a bad SSD or a btrfs bug, however no
> crash dump was captured to help with debugging. New hardware will be
> deployed this afternoon after an attempt at data recovery is made.
Perhaps it's time to move this list over to vger or the linux.dev
infrastructure now that it's being brought up? We already migrated the
containers list without too much pain.
Regards,
James
On Fri, Jun 25, 2021 at 10:00:15AM -0700, James Bottomley wrote:
> Perhaps it's time to move this list over to vger or the linux.dev
> infrastructure now that it's being brought up? We already migrated the
> containers list without too much pain.
Maybe the btrfs bugs should get fixed.
-ben
--
"Thought is the essence of where you are now."
On Fri, 2021-06-25 at 13:12 -0400, Benjamin LaHaise wrote:
> On Fri, Jun 25, 2021 at 10:00:15AM -0700, James Bottomley wrote:
> > Perhaps it's time to move this list over to vger or the linux.dev
> > infrastructure now that it's being brought up? We already migrated
> > the containers list without too much pain.
>
> Maybe the btrfs bugs should get fixed.
I believe we can do both.
James
On Fri, Jun 25, 2021 at 12:21:24PM -0700, James Bottomley wrote:
> On Fri, 2021-06-25 at 13:12 -0400, Benjamin LaHaise wrote:
> > On Fri, Jun 25, 2021 at 10:00:15AM -0700, James Bottomley wrote:
> > > Perhaps it's time to move this list over to vger or the linux.dev
> > > infrastructure now that it's being brought up? We already migrated
> > > the containers list without too much pain.
> >
> > Maybe the btrfs bugs should get fixed.
>
> I believe we can do both.
If I were unresponsive at fixing issues, I would understand the need to
migrate services, but steps to address the failures have already been
taken and additional mitigations are planned. If we migrated services
every time a piece of hardware failed or we hit a kernel bug, then we
wouldn't have any infrastructure left.
-ben
--
"Thought is the essence of where you are now."
On Fri, 2021-06-25 at 15:26 -0400, Benjamin LaHaise wrote:
> On Fri, Jun 25, 2021 at 12:21:24PM -0700, James Bottomley wrote:
> > On Fri, 2021-06-25 at 13:12 -0400, Benjamin LaHaise wrote:
> > > On Fri, Jun 25, 2021 at 10:00:15AM -0700, James Bottomley wrote:
> > > > Perhaps it's time to move this list over to vger or the
> > > > linux.dev infrastructure now that it's being brought up? We
> > > > already migrated the containers list without too much pain.
> > >
> > > Maybe the btrfs bugs should get fixed.
> >
> > I believe we can do both.
>
> If I were unresponsive at fixing issues, I would understand the need
> to migrate services, but steps to address the failures have already
> been taken and additional mitigations are planned. If we migrated
> services every time a piece of hardware failed or we hit a kernel
> bug, then we wouldn't have any infrastructure left.
It's not about response time, it's about the fact that we finally got
kernel.org funded via the LF to pay for someone to run our mailing list
infrastructure so we no longer have to do it ourselves. We've already
transferred the containers mailman list and vger is going to be
migrated to it soon. The new infrastructure comes with HA and a whole
host of backend CDN data centres in various geographies and public
inbox backing, so it should be quite slick.
James
On Fri, Jun 25, 2021 at 03:26:07PM -0400, Benjamin LaHaise wrote:
> On Fri, Jun 25, 2021 at 12:21:24PM -0700, James Bottomley wrote:
> > On Fri, 2021-06-25 at 13:12 -0400, Benjamin LaHaise wrote:
> > > On Fri, Jun 25, 2021 at 10:00:15AM -0700, James Bottomley wrote:
> > > > Perhaps it's time to move this list over to vger or the linux.dev
> > > > infrastructure now that it's being brought up? We already migrated
> > > > the containers list without too much pain.
> > >
> > > Maybe the btrfs bugs should get fixed.
> >
> > I believe we can do both.
>
> If I were unresponsive at fixing issues, I would understand the need to
> migrate services,
Well, the DKIM issue has been left unresolved for a long time.
I saw on the bug conversation there seems to be no clear path to fix
it?
The LF/vger lists don't have this problem. The amount of email
impacted via recipient spam filtering seems to be increasing every
month, and tools like b4 don't work as intended.
It is not some minor complaint.
Jason
On Mon, Jun 28, 2021 at 10:46:07AM -0300, Jason Gunthorpe wrote:
> Well, the DKIM issue has been left unresolved for a long time.
> I saw on the bug conversation there seems to be no clear path to fix
> it?
Nobody had ever bothered to provide or figure out a test case. We have
one now and I'm working with Tucows to try to sort out a fix.
The fact of the matter is that the DKIM spec is broken and doesn't
properly address issues relating to transport of emails containing UTF-8
content over SMTP sessions which are limited to 7 bit transport due to
backwards compatibility assumptions.
-ben
--
"Thought is the essence of where you are now."
On Mon, Jun 28, 2021 at 09:53:52AM -0400, Benjamin LaHaise wrote:
> The fact of the matter is that the DKIM spec is broken and doesn't
> properly address issues relating to transport of emails containing UTF-8
> content over SMTP sessions which are limited to 7 bit transport due to
> backwards compatibility assumptions.
Isn't a 7-bit conversion what I pointed at last time we talked about
this?
DKIM assumes a "modern" mail system, there should not be 7bit
conversions in the mail pipeline. Anyone sending DKIM needs to be 8
bit clean.
Jason
On Mon, Jun 28, 2021 at 10:40:51AM -0400, Benjamin LaHaise wrote:
> On Mon, Jun 28, 2021 at 11:26:59AM -0300, Jason Gunthorpe wrote:
> > Isn't a 7-bit conversion what I pointed at last time we talked about
> > this?
>
> I changed several options in postfix last time this was raised, but as
> nobody ever provided a test case, I had no way of knowing if it worked or
> not.
I've been using a script like this against the lore public inbox git
repos to monitor my own domain's DKIM cleanliness and interaction with
list servers:
#!/usr/bin/python3
import subprocess
import collections

# Starting points
start = "XXXXX"  # placeholder: git commit id string

# Map each matching author to the commits (one per message) they sent.
emails = collections.defaultdict(list)
commits = subprocess.check_output(
    ["git", "log", "master", "^" + start,
     "--pretty=format:%H %aN <%aE>"]).decode()
for ln in commits.splitlines():
    commit, _, email = ln.partition(' ')
    if "nvidia.com" in email.lower():
        emails[email].append(commit)

# Check one message per author per pass, oldest first, and stop
# checking an author after the first failure.
fails = set()
not_empty = True
while not_empty:
    not_empty = False
    for email, commits in sorted(emails.items()):
        if email in fails or not commits:
            continue
        commit = commits.pop()
        if commits:
            not_empty = True
        # public-inbox stores the raw message as the file "m".
        msg = subprocess.check_output(["git", "show", commit + ":m"])
        try:
            subprocess.check_output(["dkimverify"], input=msg)
            #print(email)
        except subprocess.CalledProcessError:
            # dkimverify exits non-zero when verification fails.
            fails.add(email)
            print("Failed!", email, commit)
It has taken a lot of doing, but nvidia.com is now effectively DKIM
clean through vger.
You could run it with some known-good domains like nvidia.com,
facebook.com, google.com, to measure kvack's activity. Failures can
often be cross-correlated against a vger list and then you can do A/B
comparison to guess what is wrong.
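
For the A/B step, here is a minimal sketch of what such a comparison could
look like, assuming you already have the raw bytes of the same message as
archived by two lists (for example via `git show <commit>:m`, as in the
script above). The helper names and the header list are made up for
illustration. The headers that actually matter are the ones named in the
sender's DKIM-Signature h= tag.

```python
import email
import hashlib
import re

# Assumed set of headers worth comparing; the authoritative list is
# whatever the sender's DKIM-Signature h= tag names.
HEADERS = ("From", "To", "Subject", "Date", "Message-ID",
           "MIME-Version", "Content-Type", "Content-Transfer-Encoding")


def split_body(raw: bytes) -> bytes:
    """Return everything after the first blank line (LF or CRLF)."""
    parts = re.split(rb"\r?\n\r?\n", raw, maxsplit=1)
    return parts[1] if len(parts) == 2 else b""


def ab_compare(raw_a: bytes, raw_b: bytes) -> list:
    """List the compared headers, plus "body", that differ between copies."""
    msg_a = email.message_from_bytes(raw_a)
    msg_b = email.message_from_bytes(raw_b)
    changed = [h for h in HEADERS if msg_a.get(h) != msg_b.get(h)]
    hash_a = hashlib.sha256(split_body(raw_a)).digest()
    hash_b = hashlib.sha256(split_body(raw_b)).digest()
    if hash_a != hash_b:
        changed.append("body")
    return changed


# Hypothetical copies of one message: one list left it 8-bit, the other
# re-encoded it as quoted-printable.
copy_a = (b"From: a@example.com\nSubject: hi\n"
          b"Content-Transfer-Encoding: 8bit\n\nGr\xc3\xbc\xc3\x9fe\n")
copy_b = (b"From: a@example.com\nSubject: hi\n"
          b"Content-Transfer-Encoding: quoted-printable\n\nGr=C3=BC=C3=9Fe\n")
print(ab_compare(copy_a, copy_b))
```

A differing Content-Transfer-Encoding together with a differing body is the
pattern you would expect from a 7-bit conversion somewhere in the pipeline.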
> DKIM is a poorly done
> spec that ignores decades of that philosophy at the IETF. And even if a
> DKIM signature passes, that's still not enough to trust the resulting
> email. All it does is ensure that a small subset of valid emails get
> dropped on the floor. This doesn't seem like an overall win.
I have no idea. It is here, people beyond us have made this decision,
we have to work within it. DMARC is ratcheting this up and is moving
to say if DKIM fails then emails should be discarded.
Jason
On Mon, Jun 28, 2021 at 11:26:59AM -0300, Jason Gunthorpe wrote:
> Isn't a 7-bit conversion what I pointed at last time we talked about
> this?
I changed several options in postfix last time this was raised, but as
nobody ever provided a test case, I had no way of knowing if it worked or
not. Personally, I think DKIM provides very little value considering that
a good chunk of the spam that goes by has valid DKIM signatures, not to
mention that it doesn't help with modern phishing attempts much either.
> DKIM assumes a "modern" mail system, there should not be 7bit
> conversions in the mail pipeline. Anyone sending DKIM needs to be 8
> bit clean.
"Be strict in what you send, and be liberal in what you receive." DKIM
makes assumptions about the mail transport layer that are not true. If
the signatures had been applied on content *after* the quoted printable
conversion, this would never have been an issue. DKIM is a poorly done
spec that ignores decades of that philosophy at the IETF. And even if a
DKIM signature passes, that's still not enough to trust the resulting
email. All it does is ensure that a small subset of valid emails get
dropped on the floor. This doesn't seem like an overall win.
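
To make the quoted-printable point concrete, here is a minimal sketch of
why a transfer-encoding change after signing breaks verification. It
assumes the "simple" body canonicalization and SHA-256, and only computes
the body hash (the value carried in the bh= tag), not a full signature.
The sample body and function names are illustrative, not taken from any
real message.

```python
import base64
import hashlib
import quopri


def simple_body_canon(body: bytes) -> bytes:
    """DKIM "simple" body canonicalization: drop trailing empty lines,
    then end the body with exactly one CRLF."""
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"


def body_hash(body: bytes) -> str:
    """Base64 SHA-256 of the canonicalized body, as carried in bh=."""
    digest = hashlib.sha256(simple_body_canon(body)).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical body containing UTF-8, as the sender signed it.
signed_body = "Grüße aus Zürich\r\n".encode("utf-8")

# What a 7-bit hop might forward instead: same text, quoted-printable.
relayed_body = quopri.encodestring(signed_body)

print("signed :", body_hash(signed_body))
print("relayed:", body_hash(relayed_body))
print("match  :", body_hash(signed_body) == body_hash(relayed_body))
```

The two hashes differ even though the decoded text is identical, because
canonicalization works on the encoded bytes and never decodes the
Content-Transfer-Encoding. Signing after the conversion, as suggested
above, would hash the bytes that actually travel.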
-ben
--
"Thought is the essence of where you are now."