Thursday, June 10, 2010

Decoding mojibake in Linux

Today I received an e-mail whose body was completely unreadable:

=D0=9D=D0=B5 =D0=BF=D1=8B=D1=82=D0=B0=D0=B9=D1=82=D0=B5=D1=81=D1=8C =D0=BE=
=D1=82=D0=B2=D0=B5=D1=87=D0=B0=D1=82=D1=8C =D0=BD=D0=B0 =D1=8D=D1=82=D0=BE =
=D0=BF=D0=B8=D1=81=D1=8C=D0=BC=D0=BE! =D0=9E=D1=81=D1=82=D0=B0=D0=B2=D0=BB=
=D1=8F=D0=B9=D1=82=D0=B5 =D0=BA=D0=BE=D0=BC=D0=BC=D0=B5=D0=BD=D1=82=D0=B0=
=D1=80=D0=B8=D0=B8 =D0=BD=D0=B0 =D1=81=D1=82=D1=80=D0=B0=D0=BD=D0=B8=D1=86=
=D0=B5 =D1=8D=D1=82=D0=BE=D0=B9 =D0=B7=D0=B0=D1=8F=D0=B2=D0=BA=D0=B8

On Windows, I got used to using the "Stirlitz" utility (ru.wikipedia.org) to automatically decode mojibake (or "krakozyabry", as we call the stuff in Russian).

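As far as I know, all these equals signs, "A"s, "B"s, "C"s, etc., come from the so-called "quoted-printable" (QP) encoding. Each "=XX" group is one byte written as two hexadecimal digits, and a lone "=" at the end of a line is a soft line break that simply joins it to the next line.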
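To make the "=XX" idea concrete, here is a minimal sketch in Python (my choice of language, not something the e-mail or recode depends on). It decodes the beginning of the message above with the standard quopri module, assuming the underlying bytes are UTF-8. The variable names are made up for illustration.

```python
import quopri

# One "=XX" pair per byte: =D0=9D is the two-byte UTF-8 sequence for one Cyrillic letter.
first_letter = bytes.fromhex("D09D").decode("utf-8")

# Start of the sample message, including the trailing "=" soft line break.
sample = (
    b"=D0=9D=D0=B5 =D0=BF=D1=8B=D1=82=D0=B0=D0=B9=D1=82=D0=B5=D1=81=D1=8C =D0=BE=\n"
    b"=D1=82=D0=B2=D0=B5=D1=87=D0=B0=D1=82=D1=8C"
)

# quopri turns "=XX" back into raw bytes and removes soft line breaks;
# the charset (assumed to be UTF-8 here) has to be applied separately.
decoded_bytes = quopri.decodestring(sample)
text = decoded_bytes.decode("utf-8")

print(first_letter)  # Н
print(text)          # Не пытайтесь отвечать
```

The printed fragment means "Do not try to reply". The soft line break in the middle of the second word disappears, which is why the encoded lines can be wrapped at arbitrary points.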

On my openSUSE 11.2 system I found a small utility called recode that is well suited to the task. I saved the e-mail as 66.eml and piped it to recode, which successfully decoded the QP and gave me readable UTF-8 text:

$> cat 66.eml | recode /qp
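
If recode is not at hand, roughly the same result can probably be had with Python's standard email package. The sketch below is only an assumption about how one might do it: the function name decode_eml_body is made up, and it assumes the message has a text/plain part whose headers declare the quoted-printable transfer encoding and the charset.

```python
from email import message_from_bytes, policy


def decode_eml_body(raw: bytes) -> str:
    """Return the plain-text body of a raw e-mail as a decoded string."""
    # policy.default gives the modern API, where get_content() undoes both
    # the transfer encoding (e.g. quoted-printable) and the charset.
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    if body is None:
        raise ValueError("no text/plain part found in the message")
    return body.get_content()


# Hypothetical usage with the file saved above:
#   with open("66.eml", "rb") as f:
#       print(decode_eml_body(f.read()))
```

Unlike the recode pipeline, which treats the whole file as QP text, this looks only at the message body and relies on the Content-Transfer-Encoding and Content-Type headers being correct.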