c++ - Handling blank lines in email headers
I came across a few mails that aren't RFC compliant:
Authentication-Results: spf=none (sender IP ) smtp.mailfrom=**@********.**;
 
Content-Type: multipart/related;
 boundary="_004_2039b206f2a54788ba6a101978bd3f82dbxpr07mb013eurprd07pro_"; type="multipart/alternative"
MIME-Version: 1.0
For example, the mail above has a blank line in its header (before Content-Type). Libraries that strictly abide by the RFC (for example https://github.com/mikel/mail) won't be able to parse such mails. Apple Mail and Thunderbird manage to handle them.
I have tried to browse through Thunderbird's codebase but, being unfamiliar with C++, I only managed to find https://github.com/mozilla/releases-comm-central/blob/1f2a40ec2adb448043de0ae96d93b44a9bfefcd1/mailnews/mime/src/mimemsg.cpp
Can anyone point me to the part of Thunderbird's codebase where mail parsing happens, or to open-source libraries/apps that handle such non-compliant mails?
Edit:
Hexdump of the blank line. It contains a space (0x20) between the two CRLF pairs.
00013e0: 2a2a 2a2a 2a2a 2e2a 2a3b 0d0a 200d 0a43  ******.**;.. ..C
00013f0: 6f6e 7465 6e74 2d54 7970 653a 206d 756c  ontent-Type: mul
0001400: 7469 7061 7274 2f72 656c 6174 6564 3b0d  tipart/related;.
The Ruby code in the referenced library does not conform to the RFC, which allows multiple lines to be folded into a single header line. The rule is that a continuation line of a header (header folding) must start with white space (a space or a tab); the exact details are in RFC 5322, section "Folding White Space and Comments".
The problem is that the Ruby code reads each line and trims white space before parsing, so it fails to detect that the line in fact belongs to the previous header. The line adds nothing to the header (it contains only a space), but it is valid syntax.
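To illustrate the folding rule, here is a minimal sketch of unfolding done before any trimming. The helper name `unfold_headers` and the sample header values are made up for this example and are not part of the mail library; it only assumes that a line starting with a space or tab continues the previous header.

```ruby
# Hypothetical helper (not part of the mail gem): turn raw header text
# into logical header lines before any white space is trimmed.
def unfold_headers(raw_headers)
  logical = []
  raw_headers.split(/\r?\n/).each do |line|
    if line.match?(/\A[ \t]/) && !logical.empty?
      # Continuation line: it belongs to the previous header, even when
      # it holds nothing but white space.
      logical[-1] = "#{logical[-1]} #{line.strip}".rstrip
    else
      logical << line
    end
  end
  logical
end

# Made-up sample with a white-space-only line between two headers.
sample = "Authentication-Results: spf=none;\r\n \r\n" \
         "Content-Type: multipart/related;\r\n" \
         "\tboundary=\"b1\"\r\n" \
         "MIME-Version: 1.0\r\n"

unfold_headers(sample).each { |header| puts header }
```

With this approach the white-space-only line is absorbed into Authentication-Results, so the sample yields three logical headers and Content-Type is still seen as a header.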
Edit:
The non-compliant behaviour was introduced in commit 17783f8536fc09b926c7425dbacfc35e0e851ef5. One of the side effects it introduced is that headers and body are split on an empty folded header line:
crlf = /\r\n/
white_space = %q|\x9\x20|
wsp = /[#{white_space}]/
header_part, body_part = raw_source.split(/#{crlf}#{wsp}*#{crlf}(?!#{wsp})/m, 2)
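The effect of that pattern can be reproduced in isolation. The sketch below uses a pattern of the same shape as the one quoted above, with the white-space class written out as `[ \t]`; the message text and the names `lenient_split`, `strict_header` and `strict_body` are assumptions made for this example only.

```ruby
# Same shape as the quoted pattern; the white-space class is written
# out explicitly as space and tab.
crlf = /\r\n/
wsp  = /[ \t]/
lenient_split = /#{crlf}#{wsp}*#{crlf}(?!#{wsp})/m

# Made-up message: a white-space-only line sits between two headers.
raw_source = "Authentication-Results: spf=none;\r\n" \
             " \r\n" \
             "Content-Type: text/plain\r\n" \
             "\r\n" \
             "Hello"

# The white-space-only line is treated as the header/body separator.
header_part, body_part = raw_source.split(lenient_split, 2)
puts header_part.inspect
puts body_part.inspect

# For comparison: split only on a truly empty line.
strict_header, strict_body = raw_source.split(/\r\n\r\n/, 2)
puts strict_header.inspect
puts strict_body.inspect
```

On this input the lenient pattern leaves only Authentication-Results in `header_part` and pushes Content-Type into `body_part`, whereas splitting on a truly empty line keeps both headers together and leaves the folded line to be unfolded afterwards.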
The issue was raised in commit a2a45597bce66ebe788cedaaab848a37bd04b25a, and the consensus was not to break the existing behavior.