Haraka is throwing errors when the client sends QUIT too early #1410

dstibrany · 2016-04-08T22:21:00Z

Haraka version

2.7.2

Observed behaviour

I occasionally see clients send 'QUIT' commands before we send '250 Message Queued' replies.
In those situations Haraka throws:

2016-04-08T13:54:53.00749 [ERROR] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] cmd_quit failed: Error: We are already running hooks! Fatal error! 2016-04-08T13:54:53.00778 [ERROR] [-] [core] undefinedError: We are already running hooks! Fatal error!0[ 'Error: We are already running hooks! Fatal error!', 2016-04-08T13:54:53.00779 ' at Object.plugins.run_hooks (/usr/local/lib/node_modules/Haraka/plugins.js:281:15)', 2016-04-08T13:54:53.00780 ' at Connection.cmd_quit (/usr/local/lib/node_modules/Haraka/connection.js:1193:13)', 2016-04-08T13:54:53.00781 ' at Connection.process_line (/usr/local/lib/node_modules/Haraka/connection.js:280:29)', 2016-04-08T13:54:53.00782 ' at Connection._process_data (/usr/local/lib/node_modules/Haraka/connection.js:423:18)', 2016-04-08T13:54:53.00783 ' at /usr/local/lib/node_modules/Haraka/connection.js:632:41', 2016-04-08T13:54:53.00783 ' at process._tickDomainCallback (node.js:486:13)' ]

Expected behaviour

Any ideas why the client is behaving this way?
Any way to handle this more cleanly? In particular, I'd prefer to have the option to DENY these, because with 421 replies I keep getting this message sent (and the subsequent error) over and over.

Steps to reproduce

Here is the last bit of data from the client before the error occurs:

...
2016-04-08T13:54:53.00168 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: 0px; font-size: 0px; overflow: hidden; mso-hide: all"/></body>\r
2016-04-08T13:54:53.00181 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: </html>\r
2016-04-08T13:54:53.00193 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: <!-- /Wrapper -->\r
2016-04-08T13:54:53.00206 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: ------=_NextPart_000_0186_01D1911A.E9F01E60--\r
2016-04-08T13:54:53.00220 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: \r
2016-04-08T13:54:53.00233 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: .\r
2016-04-08T13:54:53.00307 [DEBUG] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] running data_post hooks
2016-04-08T13:54:53.00319 [DEBUG] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] running data_post hook in hdmailer plugin
2016-04-08T13:54:53.00646 [PROTOCOL] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: QUIT state=1
2016-04-08T13:54:53.00656 [DEBUG] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] running quit hooks
2016-04-08T13:54:53.00749 [ERROR] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] cmd_quit failed: Error: We are already running hooks! Fatal error!

Thanks, let me know if there's any more info I can provide.

Dave

The text was updated successfully, but these errors were encountered:

baudehlo · 2016-04-08T22:37:11Z

Hi Dave, nice to hear from you!

(for other Haraka devs reading this - I used to work at the company using
this, and wrote the original plugin here)

Knowing a bit about what goes on in hdmailer, it's related to handling
attachments and pausing/resuming is my guess.

We wouldn't process the QUIT from the client if the connection was paused.
At least I hope not - that code hasn't broken for my uses of it.

I thought I had put enough code into hdmailer to support this - I guess not.

You can see what is required by looking into the attachment plugin:
https://github.com/haraka/Haraka/blob/master/plugins/attachment.js - look
for where we call pause and resume. Check against those places in hdmailer
and see if you're missing something. The basic principal is that you're
calling resume() too soon. Add logging to the plugin to see if you're
calling resume before the processing of attachments ends.

Alternatively just contact me personally (you have my email) and I'll take
a poke at the code - it might be a straightforward fix.

As a side note, I'd love to improve all of this hackery. But it's a big
back-burner ticket in Haraka. Unfortunately due to attachments using
streams, we have yet to figure out a way to make this transparent to
plugins. Maybe for 3.0.

On Fri, Apr 8, 2016 at 6:21 PM, Dave Stibrany [email protected]
wrote:

Haraka version

2.7.2
Observed behaviour

I occasionally see clients send 'QUIT' commands before we send '250
Message Queued' replies.
In those situations Haraka throws:

2016-04-08T13:54:53.00749 [ERROR] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1]
[core] cmd_quit failed: Error: We are already running hooks! Fatal error!
2016-04-08T13:54:53.00778 [ERROR] [-] [core] undefinedError: We are
already running hooks! Fatal error!0[ 'Error: We are already running hooks!
Fatal error!',
2016-04-08T13:54:53.00779 ' at Object.plugins.run_hooks
(/usr/local/lib/node_modules/Haraka/plugins.js:281:15)',
2016-04-08T13:54:53.00780 ' at Connection.cmd_quit
(/usr/local/lib/node_modules/Haraka/connection.js:1193:13)',
2016-04-08T13:54:53.00781 ' at Connection.process_line
(/usr/local/lib/node_modules/Haraka/connection.js:280:29)',
2016-04-08T13:54:53.00782 ' at Connection._process_data
(/usr/local/lib/node_modules/Haraka/connection.js:423:18)',
2016-04-08T13:54:53.00783 ' at
/usr/local/lib/node_modules/Haraka/connection.js:632:41',
2016-04-08T13:54:53.00783 ' at process._tickDomainCallback
(node.js:486:13)' ]
Expected behaviour

Any ideas why the client is behaving this way?

Any way to handle this more cleanly? In particular, I'd prefer to have
the option to DENY these, because with 421 replies I keep getting this
message sent (and the subsequent error) over and over.
Steps to reproduce

Here is the last bit of data from the client before the error occurs:

...
2016-04-08T13:54:53.00168 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: 0px; font-size: 0px; overflow: hidden; mso-hide: all"/>\r
2016-04-08T13:54:53.00181 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: \r
2016-04-08T13:54:53.00193 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: \r
2016-04-08T13:54:53.00206 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: ------=_NextPart_000_0186_01D1911A.E9F01E60--\r
2016-04-08T13:54:53.00220 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: \r
2016-04-08T13:54:53.00233 [DATA] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: .\r
2016-04-08T13:54:53.00307 [DEBUG] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] running data_post hooks
2016-04-08T13:54:53.00319 [DEBUG] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] running data_post hook in hdmailer plugin
2016-04-08T13:54:53.00646 [PROTOCOL] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] C: QUIT state=1
2016-04-08T13:54:53.00656 [DEBUG] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] running quit hooks
2016-04-08T13:54:53.00749 [ERROR] [EF24362D-72DC-4DBB-8B60-00DAD4A12BEF.1] [core] cmd_quit failed: Error: We are already running hooks! Fatal error!

Thanks, let me know if there's any more info I can provide.

Dave

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#1410

dstibrany · 2016-04-11T14:37:22Z

Hey Matt! Great to hear from you as well.

I had to end up going in and making some changes to the plugin, because we now support html emails. So we need to try and decide if an attachment is a true attachment or if it's part of an html email and making that decision turns out to be pretty hard.

Anyways, looks like this bug was introduced when I added that functionality. I'll try adding some more logging around pause/resume as you suggested.

Thanks!

Dave

baudehlo · 2016-04-11T15:09:23Z

You could probably just ignore attachments with a content-id header, no?

On Mon, Apr 11, 2016 at 10:37 AM, Dave Stibrany [email protected]
wrote:

Closed #1410 #1410.

—
You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#1410 (comment)

dstibrany · 2016-04-11T21:01:04Z

I think the problem was that some legitimate attachments still have a content-id, like when somebody drags an image into a gmail compose window, gmail treats it like an inline attachment.

smfreegard · 2016-04-11T21:03:26Z

Isn't Content-Disposition more appropriate here. e.g. Content-Disposition: attachment ...

dstibrany · 2016-04-11T21:07:45Z

A lot of clients seem to mark them as 'inline', iPhone's mail app for example.

baudehlo · 2016-04-11T21:09:30Z

Not all clients add the content disposition either. Isn't email fun :)

On Apr 11, 2016, at 5:03 PM, Steve Freegard [email protected] wrote:

Isn't Content-Disposition more appropriate here. e.g. Content-Disposition: attachment ...

—
You are receiving this because you commented.
Reply to this email directly or view it on GitHub

dstibrany · 2016-04-11T21:17:52Z

Processing attachments correctly has been crazy. I had to write some heuristic to make a decision, and in cases where we're not sure, get both the html body and the attachments. That way we're at least somewhat guaranteed not to miss anything.

baudehlo · 2016-04-11T22:01:47Z

You'll also want to upgrade to 2.8 as soon as you can because it handles the headers/file names more correctly.

On Apr 11, 2016, at 5:17 PM, Dave Stibrany [email protected] wrote:

Processing attachments correctly has been crazy. I had to write some heuristic to make a decision, and in cases where we're not sure, get both the html body and the attachments. That way we're at least somewhat guaranteed not to miss anything.

—
You are receiving this because you commented.
Reply to this email directly or view it on GitHub

dstibrany · 2016-04-12T16:46:26Z

Nice, thanks!

dstibrany · 2017-03-21T15:14:59Z

Hey Matt,

We're having this issue pop up again. We end up reading the client's QUIT too early and then throw errors. I'm able to reproduce this locally against the stock "attachment.js" plugin via netcat:

$ cat broken.txt | nc -i 1 localhost 25

See the attached bad.email file. I'm pretending that the file is a zip so that it goes through the unarchive process and also note that I needed to add a \r\n after the final dot to get it to work in netcat.
broken.txt

Do you have an suggestions on how to work around this? This issue only happens with 2 or 3 clients, but happens consistently for those clients.

Some additional info:
OS: Ubuntu 14.04
Node version: v4.6.0
Haraka version: 2.8.12

baudehlo · 2017-03-21T15:42:51Z

There's no gap between the headers and the body. Is that intentional?

…

On Tue, Mar 21, 2017 at 11:15 AM, Dave Stibrany ***@***.***> wrote: Hey Matt, We're having this issue pop up again. We end up reading the client's QUIT too early and then throw errors. I'm able to reproduce this locally against the stock "attachment.js" plugin via netcat: $ cat broken.txt | nc -i 1 localhost 25 See the attached bad.email file. I'm pretending that the file is a zip so that it goes through the unarchive process and also note that I needed to add a \r\n after the final dot to get it to work in netcat. broken.txt <https://github.com/haraka/Haraka/files/858574/broken.txt> Do you have an suggestions on how to work around this? This issue only happens with 2 or 3 clients, but happens consistently for those clients. Some additional info: OS: Ubuntu 14.04 Node version: v4.6.0 Haraka version: 2.8.12 — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#1410 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAobY5aTCbUcrxt01yQ-XLOCbOlcIUbKks5rn-l0gaJpZM4IDbym> .

dstibrany · 2017-03-21T16:29:22Z

Damn, the no gap was unintentional. With the gap the test case now totally passes both on the stock attachment.js and our own plugin. I'll see if I can find another test case that breaks our app.

I pushed a change to how we do pause/resume of the stream, so I'll see if that makes this go away.

baudehlo · 2017-03-21T17:00:59Z

In fairness it shouldn't crash Haraka, even with no gap. Is it doing that?

…

On Tue, Mar 21, 2017 at 12:29 PM, Dave Stibrany ***@***.***> wrote: Damn, the no gap was unintentional. With the gap the test case now totally passes both on the stock attachment.js and our own plugin. I'll see if I can find another test case that breaks our app. I pushed a change to how we do pause/resume of the stream, so I'll see if that makes this go away. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#1410 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAobY9Rfy2PBHgjeqoxPtC5F-jrC16J6ks5rn_rjgaJpZM4IDbym> .

dstibrany · 2017-03-21T17:28:36Z

Not a hard crash, just lots of errors:

[ERROR] [BA8E40D0-3D19-4233-8C54-386388B00007.1] [core] cmd_quit failed: Error: We are already running hooks! Fatal error!
[ERROR] [-] [core] undefinedError: We are already running hooks! Fatal error!0[ 'Error: We are already running hooks! Fatal error!',
  '    at Object.plugins.run_hooks (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/plugins.js:396:15)',
  '    at Connection.cmd_quit (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/connection.js:1279:13)',
  '    at Connection.process_line (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/connection.js:319:29)',
  '    at Connection._process_data (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/connection.js:464:18)',
  '    at Immediate._onImmediate (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/connection.js:680:37)',
  '    at processImmediate [as _immediateCallback] (timers.js:383:17)' ]

and then we send a 421 Internal Server Error

baudehlo · 2017-03-21T17:35:18Z

OK, but that error also shouldn't occur. I'll have a look anyway.

…

On Tue, Mar 21, 2017 at 1:28 PM, Dave Stibrany ***@***.***> wrote: Not a hard crash, just lots of errors: [ERROR] [BA8E40D0-3D19-4233-8C54-386388B00007.1] [core] cmd_quit failed: Error: We are already running hooks! Fatal error! [ERROR] [-] [core] undefinedError: We are already running hooks! Fatal error!0[ 'Error: We are already running hooks! Fatal error!', ' at Object.plugins.run_hooks (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/plugins.js:396:15)', ' at Connection.cmd_quit (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/connection.js:1279:13)', ' at Connection.process_line (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/connection.js:319:29)', ' at Connection._process_data (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/connection.js:464:18)', ' at Immediate._onImmediate (/home/vagrant/.nvm/versions/node/v4.6.0/lib/node_modules/Haraka/connection.js:680:37)', ' at processImmediate [as _immediateCallback] (timers.js:383:17)' ] and then we send a 421 Internal Server Error — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#1410 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAobY9lp0wFNP8qa7ijJPLKzIPIlziPlks5roAjFgaJpZM4IDbym> .

dstibrany · 2017-03-21T17:39:18Z

Awesome, thanks Matt.

dstibrany · 2017-03-24T03:22:14Z

Hey, I've managed to create another minimal test case that breaks both our app and the attachment plugin. The key was having an attachment size >= 16384 bytes, and sending a QUIT right after the final dot. If I reduce the attachment size down to 16383, everything works fine.

$ cat min_test_case.txt | nc -i 1 localhost 25

min_test_case.txt

dstibrany · 2017-04-10T14:25:29Z

Bump on this, I think it might be a legitimate issue and am wondering if you guys can confirm. Running the above test case against the stock attachment plugin should reproduce it.

smfreegard · 2017-04-10T14:36:35Z

@dstibrany - looks like an easy fix. Can you change this line https://github.com/haraka/Haraka/blob/master/plugins.js#L402

to

    if (!/^(reset_transaction|disconnect|quit|deny|log)$/.test(hook) &&

And let me know that it fixes this.

dstibrany · 2017-04-10T21:18:29Z

@smfreegard That fix prevents all of the "'Error: We are already running hooks!" errors from being thrown, but the attachment is then not processed in my test case. Haraka reads the QUIT too early, then disconnects, at which point the data_post hooks do not run.

I updated the test case; my previous version was using a content-type of application/pdf which was skipping the unarchive code. The new version has a content-type of application/zip; if you send it in to the attachment plugin as is (with the above fix you provided), it skips the attachment. If you remove one line of text from the body, it processes the attachment and works fine.

min_test_case.txt

This issue seems to be related to having a body size > 16k and having the client send the QUIT early.

Let me know if there's any more info I can provide or any way that I can help.

smfreegard · 2017-04-11T08:40:30Z

@dstibrany - Ok reading back, your 'test-case' is broken though. You're making the SMTP Client send the ending dot and then immediately pipelining a QUIT straight after which a real SMTP client would never do. What should happen in that case is that Haraka shouldn't deliver the message because the client didn't wait for the response, it shouldn't process the attachment because the remote end has disconnected, so it would be wasting it's time to do so.

The 16Kb thing is probably what was already buffered by the kernel in a single packet, so splitting it would mean the QUIT would likely be received in a separate packet, allowing it to proceed through the attachment hooks before the QUIT is seen by Haraka.

In your test case - if this were a real SMTP client, it should retry the message again because it doesn't know the message was delivered, because it never received the '250' in response to sending the ending '.', so it should be retried again.

So - there's either a bug in your code, or, your attachment hooks are taking so long (e.g. usually this would be > 30s) that the remote clients are dropping the connections on a timeout. Looking at your logs should tell you this e.g.:

Apr 11 04:07:06 mail1-ewh haraka[25302]: [NOTICE] [544C61F5-BE96-41AD-BBCD-895F18082605.1] [core] message mid="1820796063.60037570.1491898011346.JavaMail.rocketman@push-dispatcher4" size=89417 rcpts=1/0/0 delay=10.825 code=CONT msg=""

or - if you don't have a message entry (this will only fire after hook_data_post completes):

Apr 11 04:07:11 mail1-ewh haraka[25302]: [NOTICE] [544C61F5-BE96-41AD-BBCD-895F18082605.1] [core] disconnect ip=50.115.211.91 rdns="mta391s1.r.groupon.com" helo="mta391s1.r.groupon.com" relay=N early=N esmtp=Y tls=Y pipe=N errors=0 txns=1 rcpts=1/0/0 msgs=1/0/0 bytes=89417 lr="" time=20.197

baudehlo · 2017-04-11T13:03:06Z

I still think this is a bug, not a broken test case. Yes the client SHOULD wait for the response to final dot, but even if it doesn't we shouldn't be processing the QUIT command before processing attachments finishes, assuming the connection is paused/resumed properly. I haven't had time to look into it yet. Maybe this easter weekend.

…

On Tue, Apr 11, 2017 at 4:40 AM, Steve Freegard ***@***.***> wrote: @dstibrany <https://github.com/dstibrany> - Ok reading back, your 'test-case' is broken though. You're making the SMTP Client send the ending dot and then immediately pipelining a QUIT straight after which a real SMTP client would never do. What should happen in that case is that Haraka shouldn't deliver the message because the client didn't wait for the response, it shouldn't process the attachment because the remote end has disconnected, so it would be wasting it's time to do so. The 16Kb thing is probably what was already buffered by the kernel in a single packet, so splitting it would mean the QUIT would likely be received in a separate packet, allowing it to proceed through the attachment hooks before the QUIT is seen by Haraka. In your test case - if this were a real SMTP client, it should retry the message again because it doesn't know the message was delivered, because it never received the '250' in response to sending the ending '.', so it should be retried again. So - there's either a bug in your code, or, your attachment hooks are taking so long (e.g. usually this would be > 30s) that the remote clients are dropping the connections on a timeout. Looking at your logs should tell you this e.g.: Apr 11 04:07:06 mail1-ewh haraka[25302]: [NOTICE] [544C61F5-BE96-41AD-BBCD-895F18082605.1] [core] message mid="1820796063.60037570.1491898011346.JavaMail. ***@***.***" size=89417 rcpts=1/0/0 *delay=10.825* code=CONT msg="" or - if you don't have a message entry (this will only fire after hook_data_post completes): Apr 11 04:07:11 mail1-ewh haraka[25302]: [NOTICE] [544C61F5-BE96-41AD-BBCD-895F18082605.1] [core] disconnect ip=50.115.211.91 rdns="mta391s1.r.groupon.com" helo=" mta391s1.r.groupon.com" relay=N early=N esmtp=Y tls=Y pipe=N errors=0 txns=1 rcpts=1/0/0 msgs=1/0/0 bytes=89417 lr="" *time=20.197* — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#1410 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAobYwYxQ6rKM5zQDRwOPgVSBUo1NH8Vks5ruzx_gaJpZM4IDbym> .

dstibrany · 2017-04-11T15:21:05Z

Is it possible that the client is doing something bizarre? This issue does not happen with > 99% of our users, but happens consistently for those ones. Here's an example, the attachment processing took < 1s but the client still appears to be sending the QUIT early, or has already sent it.

2017-04-10T17:41:58.22501 [PROTOCOL] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC] [core] C: EHLO server.example.com state=1
... rest of the message
2017-04-10T17:41:58.70248 [DATA] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] C: .\r
2017-04-10T17:41:58.70288 [DEBUG] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] running data_post hooks
2017-04-10T17:41:58.70306 [DEBUG] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] running data_post hook in hdmailer plugin
2017-04-10T17:41:58.70384 [PROTOCOL] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] C: QUIT state=1
2017-04-10T17:41:58.70396 [DEBUG] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] running quit hooks
2017-04-10T17:41:58.70464 [ERROR] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] cmd_quit failed: Error: We are already running hooks! Fatal error!
2017-04-10T17:41:58.70703 [PROTOCOL] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] S: 421 Internal Server Error
2017-04-10T17:41:58.70734 [DEBUG] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] running reset_transaction hooks
2017-04-10T17:41:58.70747 [DEBUG] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] running disconnect hooks
2017-04-10T17:41:58.70761 [NOTICE] [7AFDF64E-85E2-44B7-9819-F2A7EC6F5ADC.1] [core] disconnect ip=x.x.x.222 rdns="unallocated-static.rogers.com" helo="server.example.com" relay=N early=N esmtp=Y tls=N pipe=Y errors=0 txns=1 rcpts=1/0/0 msgs=0/0/0 bytes=120808 lr="421 Internal Server Error" time=0.572

smfreegard · 2017-04-11T16:08:38Z

@dstibrany - that log excerpt shows that the client immediately sends QUIT after the dot, there is not even a seconds delay whilst it waits for the response. Whatever is doing that is terminally busted IMO and isn't a real SMTP client.

dstibrany · 2017-04-11T19:20:48Z

Yep it's super frustrating. Originally we had one customer whose client was behaving this way and I wrote it off as a one-off error that I'll never encounter again; then recently we had a second customer who exhibits the same behaviour so I started looking into it again.

What do you mean by "isn't a real SMTP client", though? I'm fairly sure that these users aren't doing anything fancy and are using some sort of email client to send this stuff in (although maybe the client they're using is busted).

baudehlo · 2017-04-11T19:25:45Z

The clients aren't doing anything wrong. See https://tools.ietf.org/html/rfc2920 section 4 the example even shows pipelining of the QUIT command. We have a bug.

…

On Tue, Apr 11, 2017 at 3:20 PM, Dave Stibrany ***@***.***> wrote: Yep it's super frustrating. Originally we had one customer whose client was behaving this way and I wrote it off as a one-off error that I'll never encounter again; then recently we had a second customer who exhibits the same behaviour so I started looking into it again. What do you mean by "isn't a real SMTP client", though? I'm fairly sure that these users aren't doing anything fancy and are using some sort of email client to send this stuff in (although maybe the client they're using is busted). — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#1410 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAobY-ngjl32Ytpp0dyw_hrpPAk571Urks5ru9KQgaJpZM4IDbym> .

baudehlo · 2017-04-17T22:48:56Z

See the linked PR. It requires changes to whatever plugin you're using to process attachments - a pause when you enter the end state, and a resume when you're done. Plus there's a minor core change which appears to help.

dstibrany · 2017-04-18T14:56:49Z

Thanks for looking into that, I'll give this a try today.

dstibrany · 2017-04-18T16:25:45Z

Nice, that worked!

* Fix for issue #1410 * Remove logging * Confirmed that it works

* ignore undefined socket.remoteAddress (#1846) similar to #239 fixes #1835 * URL to manual was 404, point to Plugins.md (#1844) * smtp forward dest split routing (#1847) * split transactions with different forward routing * use tls_ini_section_with_defaults (#1848) * use net_utils.tls_ini_section_with_defaults() fixes #1691 * assure conn/tran still exists before storing results (#1849) * chore(package): update redis to version 2.7.0 (#1854) https://greenkeeper.io/ * fix(package): update ipaddr.js to version 1.3.0 (#1857) https://greenkeeper.io/ * Fix modifier regexp (#1859) * doc typo in config file name (#1865) Actual this is typo in plugin source - it uses "ini" file from another plugin called rabbitmq . So, in fact, no one of the plugins uses config file with name "rabbitmq_amqplib.ini" . This cause error (default values for the plugin) if you trying use the official documentation. * fix(package): update async to version 2.2.0 (#1863) https://greenkeeper.io/ * replace util.inherits with class .. extends (#1862) * replace the discouraged util.inherits with `class Foo extends Bar` syntax * chunkemitter: remove Buffer.concat shim for node < 0.7.11 * fsync_writesteam: moved into ./outbound (only required once) * tls_socket: always set `isServer=true` (an recent merge made it conditional) * removed test for cleartext.getCipher, it *should* always be present * Added RabbitMQ vhost support (#1866) * Add RabbitMQ vhost support Virtual Host. Start with "/". Leave blank or not use if you don't want to use virtual hosts. * Get pool (#1868) * pass cfg object instead, simplifying get_pool * more consistency with variable naming and passing * add integration test: send message w/smtp_client * tls consistency cleanups (#1851) * tls_socket: add options.SNICallback - refresh options.secureContext after setting a different cert in options - plugins/tls: add plugin.loadPemDir - replace some + ‘ ‘ + patterns with string interpolation - replace some `function (a,b) {` patterns with `(a,b) => {` * add very basic tests for tls_socket * move SNI into plugins/tls * revert * tls: remove interim variables (#1871) * - remove interim variables * remove unused _controlReleased * Split delivery relay (#1875) * do not defer relaying clients with split transactions * disable naïve comment stripping (#1876) * disable naïve comment stripping fixes #1874 * tickle travis * fix(package): update async to version 2.3.0 (#1878) https://greenkeeper.io/ * chore(package): update ocsp to version 1.2.0 (#1879) https://greenkeeper.io/ * Added missing 'default' keyword in rspamd plugin. Score wasn't added … (#1856) * Added missing 'default' keyword in rspamd plugin. Score wasn't added properly to header * Logging envelope from / recipients in lowercase in order to avoid changing ES analyzer ( slower performance ) * handle case where OCSP server is unavailable (#1880) this likely fixes #1855 * Add .editorconfig (#1884) * Removing typo from 'relay.md' (#1886) * Fix for issue #1410 (#1887) * Fix for issue #1410 * Remove logging * Confirmed that it works * fixed auth_proxy socket error (#1894) #1893 * Implement the missing upgrade method on SMTPClient. (#1901) * fix(package): update async to version 2.4.0 (#1896) https://greenkeeper.io/ * 1861 binary log (#1902) * Binary log file The log file without that replacement, is interpreted as binary file log issue. * Binary log replace regexp binary replace with code replace * remove variable, line wrap closes 1861 * Outbound split and move into folder (#1850) * Move outbound to its own sub-folder * Start of splitting up outbound into separate files * Progress on splitting * More progress, not working yet * Outbound now seems to work. Tests still fail. * Add missing file * Fix outbound.js test * Fix lint errors * Outbound-Tests for RFC3464 bounce messages reworked (#1889) * fix path to fsync_writestream (./ vs ../) * make sure list_re is defined before access (#1903) * Check error code instead of number. Fixes GH-1895. (#1899) * Bring client pool up to where we were before split (#1915) * Bring client pool up to where we were before split * Add changes from #1848 * Merge in class/extends changes * Fix all logging to be [outbound] prefixed * Fix self logger * Fix net_utils call * fix(package): update sprintf-js to version 1.1.0 (#1906) * F/outbound client certificates (#1908) * Add tls_options and associated tests to queue/smtp_forward * Fix exception when no tls blacklist added. Add tests for smtp_client. * Add documentation for outbound client_certificate * Fix path parsing bug on Windows platform (#1919) * Attempt to add some belt and braces around FIN (#1916) * Attempt to add some belt and braces around FIN * Add logging on write failure * Allow outbound pools to be disabled (#1917) * Allow outbound pools to be disabled * Few changes after actually testing the code * Fix for shutting down clients * Set graceful shutdown off by default (#1927) * Set graceful shutdown off by default Still allow haraka -c <path> --graceful Also sets graceful shutdown to shutdown in max 2xChildren amount of time it sets to shutdown. * Allow gracefulShutdown() to always be graceful Makes the tests pass * Fix comment * Run dumped logs through plugins not console (#1929) * Don't blow the stack on qstat (#1930) * Don't blow the stack on qstat * Should fix another call stack blowing issue * Allow "Unknown Result" and Socket Error to Try Next Host (#1931) * Allow "Unknown Result" and Socket Error to Try Next Host If multiple hosts are listed for clamd and an "unknown result" is returned or a socket error occurs after connection, then attempt to try remaining hosts before returning DENYSOFT. * Remove trailing spaces * Clear off some done tasks (#1928) * Clear off some done tasks * remove completed or abandoned TODO items * Check pool exists before delete (#1937) * Fixes loading outbound under cluster. (#1934) I don't really understand why this stuff broke. It seems to be a bug in node's require() caching, frankly. When loading modules they would return {} before actually being loaded. This appears to be due to recursive requires. This might fix #1933 Please test * Fix cluster messaging for node v6+ (#1938) * Fix undefined variable platformDOT in hmail.js (#1943) * Update qfile.js * Update hmail.js * Update index.js * Fix queue not loaded for single process (#1941) * Fix queue not loaded for single process * Update server.js * Use the sunset keyword in version specific code blocks (#1939) Because `sunset` is the keyword we'll grep for (if we follow our instructions) when doing a semver major release. * Fix PROTOCOL logs that have intermediate \n chars (#1947) Most noted in SpamAssassin output * Load logger in a setImmediate call (#1948) * Bump haraka-results required version (#1949) * Bump haraka-results required version * Update package.json * Fix undefined FsyncWriteStream var (#1953) * Fix undefined FsyncWriteStream var * Update hmail.js * Be more strict in attachment filename matching (#1957) * Be more strict in attachment filename matching This copes with filename="d'euvre" * Add a simple test * Use punycode domain (#1944) * Support SMTPUTF8 properly * Use punycode for mail from too * Enable SMTPUTF8 support * Transaction is null there. Wait until it's set * Add needed utf8 changes to connection.js * punycode for the other addresses * Fix trailing space * Be consistent with todo.domain (#1960) * Minor typo fix (#1963) Thanks * Update package.json (#1968) * Fix unaquired socket errors in test suite (#1971) * Support the new address-rfc2822 (#1970) * Support the new address-rfc2822 New module supports Groups, which have no host method * Force version of rfc2822 required by the fix * Fix missing backslash * RabbitMQ: fix encoding of user and password when creating connection string (#1964) * fix(package): update ipaddr.js to version 1.4.0 (#1972) * Fix link redirection (#1975) * require node LTS version (6+) (#1958) * require node LTS version (6+) * appveyor: install node v6 * whitespace changes for eslint 4 compat (#1979) * whitespace changes for eslint 4 compat * one more ws change * fix(package): update iconv to version 2.3.0 (#1981) * fix(package): update async to version 2.5.0 (#1982) * Add node v8 (#1951) * Add node v8 * enable node v8 testing, permitting v8 failures * import Plugins.md from v3 branch (#1991) * remove logging line (#1985) the prefix is not required in config/plugins as the message states, but it is required for an NPM packaged plugin that requires another NPM packaged plugin and is then spurious * doc: add note that smtp_forward only supports STARTTLS (#1988) * Update qmail-queue.js (#1997) qmail-queue binary Adds a Received: header with bare LF. Haraka sets CRLF by default resulting in mixed line endings which break header processing in some e-mail clients. messagestream can convert CRLF to LF on the fly when calling qmail-send. * rebuild blacklist upon file change (#1990) previously just added new entries fixes 1987 * Remove spurious logs (#1989) * replace plugins/log.elasticsearch with npm packaged (#2004) * replace access with npm packaged haraka-plugin-access (#1992) * use WRITE_EXCL from haraka-constants (#2011) * fix(package): update js-yaml to version 3.9.0 (#2002) * build_todo() is part of the outbound/index.js api (#2016) * rename xclient.hosts to match plugin & docs (#2014) fixes #1265 * correct the config file name to relay.ini (#2012) * fix(package): update semver to version 5.4.0 (#2015) * require node 8 tests to pass (#2017) now that node 8.2+ is on Travis * fixes #1765 (#2013) * Fix auth plugin failure when re-selecting auth method (#2000) When sending the following sequence of commands, the auth plugin fails since it tries to decode an undefined value as base64: AUTH LOGIN Z2lyaXNo 334 UGFzc3dvcmQ6 AUTH LOGIN 500 Unrecognized command # at this point plugin has failed Fixes #1999 * Improve logging UUID tracking in Outbound (#2018) * prep release 2.8.14 (#1932) * prep for next release * update Changes * improve Changes formatting * make the rest of the versions h2 * update Changes

dstibrany closed this as completed Apr 11, 2016

dstibrany reopened this Mar 21, 2017

baudehlo pushed a commit that referenced this issue Apr 17, 2017

Fix for issue #1410

75d12df

baudehlo mentioned this issue Apr 17, 2017

Fix for issue #1410 #1887

Merged

2 tasks

baudehlo closed this as completed in #1887 Apr 18, 2017

baudehlo added a commit that referenced this issue Apr 18, 2017

Fix for issue #1410 (#1887)

e305122

* Fix for issue #1410 * Remove logging * Confirmed that it works

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Haraka is throwing errors when the client sends QUIT too early #1410

Haraka is throwing errors when the client sends QUIT too early #1410

dstibrany commented Apr 8, 2016

baudehlo commented Apr 8, 2016

dstibrany commented Apr 11, 2016

baudehlo commented Apr 11, 2016

dstibrany commented Apr 11, 2016

smfreegard commented Apr 11, 2016

dstibrany commented Apr 11, 2016

baudehlo commented Apr 11, 2016

dstibrany commented Apr 11, 2016

baudehlo commented Apr 11, 2016

dstibrany commented Apr 12, 2016

dstibrany commented Mar 21, 2017

baudehlo commented Mar 21, 2017 via email

dstibrany commented Mar 21, 2017

baudehlo commented Mar 21, 2017 via email

dstibrany commented Mar 21, 2017

baudehlo commented Mar 21, 2017 via email

dstibrany commented Mar 21, 2017

dstibrany commented Mar 24, 2017

dstibrany commented Apr 10, 2017 •

edited

Loading

smfreegard commented Apr 10, 2017

dstibrany commented Apr 10, 2017 •

edited

Loading

smfreegard commented Apr 11, 2017

baudehlo commented Apr 11, 2017 via email

dstibrany commented Apr 11, 2017

smfreegard commented Apr 11, 2017

dstibrany commented Apr 11, 2017

baudehlo commented Apr 11, 2017 via email

baudehlo commented Apr 17, 2017

dstibrany commented Apr 18, 2017

dstibrany commented Apr 18, 2017

Haraka is throwing errors when the client sends QUIT too early #1410

Haraka is throwing errors when the client sends QUIT too early #1410

Comments

dstibrany commented Apr 8, 2016

Haraka version

Observed behaviour

Expected behaviour

Steps to reproduce

baudehlo commented Apr 8, 2016

dstibrany commented Apr 11, 2016

baudehlo commented Apr 11, 2016

dstibrany commented Apr 11, 2016

smfreegard commented Apr 11, 2016

dstibrany commented Apr 11, 2016

baudehlo commented Apr 11, 2016

dstibrany commented Apr 11, 2016

baudehlo commented Apr 11, 2016

dstibrany commented Apr 12, 2016

dstibrany commented Mar 21, 2017

baudehlo commented Mar 21, 2017 via email

dstibrany commented Mar 21, 2017

baudehlo commented Mar 21, 2017 via email

dstibrany commented Mar 21, 2017

baudehlo commented Mar 21, 2017 via email

dstibrany commented Mar 21, 2017

dstibrany commented Mar 24, 2017

dstibrany commented Apr 10, 2017 • edited Loading

smfreegard commented Apr 10, 2017

dstibrany commented Apr 10, 2017 • edited Loading

smfreegard commented Apr 11, 2017

baudehlo commented Apr 11, 2017 via email

dstibrany commented Apr 11, 2017

smfreegard commented Apr 11, 2017

dstibrany commented Apr 11, 2017

baudehlo commented Apr 11, 2017 via email

baudehlo commented Apr 17, 2017

dstibrany commented Apr 18, 2017

dstibrany commented Apr 18, 2017

dstibrany commented Apr 10, 2017 •

edited

Loading

dstibrany commented Apr 10, 2017 •

edited

Loading