Malware activity is increasing day by day. Security solutions such as EDR, anti-virus, anti-malware, and sandboxes exist to counter malicious software. Among these, sandboxes are one of the most effective malware analysis products, and their success rate is also increasing day by day.
With new updates, Malwation AIMA extracts the configurations of malware families in addition to the extra features it offers, and these configurations are critically important IOCs. Today, AIMA stably extracts the configurations of dozens of malware families. In this article we show how to extract the configuration of one of them, GuLoader, with a Python script, and we share that script with the open source community.
GuLoader (also known as CloudEye) is a loader-type malware written in Visual Basic. It downloads RAT and stealer malware such as AgentTesla, NetWire, and Formbook from a remote server to the victim’s system and runs it. The downloaded malware is usually hosted on Google Drive or OneDrive.
Sophisticated malware that is still active today often relies on heavy obfuscation and packing to evade security products and to complicate the work of malware analysts. For this reason, a clean memory dump is the most reliable source for configuration extraction.
First Part: Robust Analysis and Detection
If you want to extract the configuration of a malware family, the most important thing is to carry out the analysis stage thoroughly and to dump the memory of several samples of the family that share the same version. If you work on different versions, the script you write will only work on the sample you are analyzing, not on the corresponding version of the malware family as a whole, which is not what we want.
After obtaining several different samples of the same malware family with the same version, we perform the analysis steps on each of them and take note of the configuration data.
As a result of the analysis, the configuration data that can be extracted from this version of the GuLoader family are as follows:
With this information in hand, we dump the memory of every sample. At this point, asynchronous memory dumps give more reliable results than synchronous ones: operations in memory progress very quickly, so data can be lost depending on timing. AIMA’s built-in advanced memory dump engine does this job for us and gives us a clean memory dump.
We reached certain configurations as a result of our previous analysis. Now we’re drawing our road map.
As can be seen in the images below, we have identified the remote server addresses from which two different samples of the same GuLoader version download their payloads. After examining several samples, we found that the pattern “0xFF 0xFF 0x68 0x74 0x74 0x70” (two 0xFF bytes followed by the ASCII bytes for “http”) can be used for this version of GuLoader.
If we relied only on the “0x68 0x74 0x74 0x70” pattern, we would treat every string that starts with “http” as a remote server, which would significantly increase our false-positive rate.
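The difference between the two patterns can be illustrated with a minimal sketch. The helper name find_offsets and the sample bytes are assumptions made for this illustration; only the two byte patterns come from the analysis above.

```python
import re

# Pattern taken from the analysis: two 0xFF marker bytes followed by "http".
MARKED_HTTP = re.compile(re.escape(b"\xff\xff\x68\x74\x74\x70"))
# The looser pattern: just the ASCII bytes for "http".
HTTP_ONLY = re.compile(re.escape(b"\x68\x74\x74\x70"))


def find_offsets(pattern, dump):
    """Return the start offset of every match of `pattern` in `dump`."""
    return [m.start() for m in pattern.finditer(dump)]


# Made-up dump fragment: one unrelated "http" string and one marked URL.
sample_dump = (
    b"http://schemas.example/ns\x00"
    b"\x00\x00\x00\x00"
    b"\xff\xffhttp://example.invalid/p.bin\x00"
)

loose_hits = find_offsets(HTTP_ONLY, sample_dump)     # matches both strings
strict_hits = find_offsets(MARKED_HTTP, sample_dump)  # matches only the marked URL
```

On this made-up fragment the loose pattern reports both strings, while the marked pattern reports only the one preceded by the two 0xFF bytes.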
We have reached the largest and perhaps the most important configuration of the GuLoader family, but our analysis also showed that this version of the malware contains other configurations. These configurations are:
Our analysis shows that this version of the GuLoader family uses two persistence mechanisms. One variant drops a copy of itself into the system’s user directory and runs it through the registry; this is the variant called Type 1 later in this article. The other variant drops a VBA script into the system’s TEMP directory, whose only job is to run a copy of the malware; this is the variant called Type 2. We need to understand these two differences well because we will write our Python script accordingly.
With the configurations we have found and the roadmap we have created, we can now automate things.
First, we need a function that parses the remote server URL, which is the most critical configuration for us. Then we need a function that extracts the User-Agent sent in the HTTP header when connecting to the remote server, a function that extracts the targeted registry path, a function that extracts the value set in the registry, and a function that detects whether the malware is Type 1 or Type 2. Finally, depending on the type of the malware, we extract the path and the name of the system directory into which it drops itself. This is the roadmap we drew before writing the Python script.
In the above parseURL function, we search the memory dump for the pattern we identified while examining it. Then we move the file pointer to the start of the pattern, which is the starting offset of the remote server address.
We read one character at a time from the current offset and append each one to the list called “zararli”. When the loop reads a “0x00” byte, it stops, because we have reached the end of the remote server address. Then we convert the address, which is stored one character per list element, into a string and return it to our main function.
Don’t be confused by the delInvalidData function here. It only deletes stray characters that are not in the ASCII table. You can do the same by passing the errors="ignore" parameter to the decode() function in Python, but we try to keep the script close to C in structure and not to skip the details.
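The steps just described can be sketched roughly as follows. This is a reconstruction from the description, not the original script: the names parseURL, delInvalidData, and zararli come from the text, while the function signatures, the choice to skip the two 0xFF marker bytes, and the sample bytes are assumptions.

```python
import io

# Pattern from the analysis: 0xFF 0xFF followed by the ASCII bytes for "http".
URL_PATTERN = b"\xff\xff\x68\x74\x74\x70"


def delInvalidData(data):
    """Drop every byte that is outside the 7-bit ASCII table."""
    return bytes(b for b in data if b < 0x80)


def parseURL(fp):
    """Return the remote server URL from a binary file object, or None."""
    fp.seek(0)
    dump = fp.read()
    offset = dump.find(URL_PATTERN)
    if offset == -1:
        return None
    # Assumption: skip the two 0xFF marker bytes so reading starts at "http".
    fp.seek(offset + 2)
    zararli = []
    while True:
        ch = fp.read(1)
        if ch == b"" or ch == b"\x00":  # end of file or end of string
            break
        zararli.append(ch)
    return delInvalidData(b"".join(zararli)).decode("ascii")


# Made-up dump fragment for illustration only.
sample = io.BytesIO(
    b"\x00junk http://not-config\x00"
    b"\xff\xffhttp://example.invalid/a.bin\x00tail"
)
url = parseURL(sample)
```

With a real dump, the same function would be called on a file opened in binary mode, for example open(path, "rb").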
Parsing the User-Agent uses the same operations as the function that parses the remote server. Naturally, this function has its own pattern.
One of the configurations was the registry path targeted for persistence. We repeat the same process with the pattern we identified in the memory dump, and this function gives us the targeted registry path.
Notice that we used Python’s re library to compile the pattern and find the matching data. You could use the find() function directly, but regular expressions are advantageous in many situations.
After finding the registry path, we need to parse the key written to the targeted registry location. As you may remember, the configurations we set out to extract included the registry key.
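A regular-expression-based parser of this kind might look like the sketch below. The article does not state the actual registry pattern, so the expression used here (two 0xFF bytes followed by a path starting with “Software\”) is a placeholder, and the function name parse_registry_path and the sample bytes are hypothetical as well.

```python
import re

# Placeholder pattern (an assumption, not the real GuLoader pattern):
# two 0xFF marker bytes, then a NUL-terminated path starting with "Software\".
REG_PATH_RE = re.compile(rb"\xff\xff(Software\\[^\x00]+)\x00")


def parse_registry_path(dump):
    """Return the first registry path matched in `dump`, or None."""
    match = REG_PATH_RE.search(dump)
    if match is None:
        return None
    # The capturing group holds only the path, without marker or terminator.
    return match.group(1).decode("ascii", errors="ignore")


# Made-up dump fragment for illustration only.
sample_dump = (
    b"\x01\x02\xff\xffSoftware\\Microsoft\\Windows\\CurrentVersion\\RunOnce\x00rest"
)
reg_path = parse_registry_path(sample_dump)
```

One advantage of the regular expression is that the capturing group isolates the value in a single step, so no manual byte-by-byte loop is needed.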
This time we extract the registry key with the find() function, to show the difference between re and find(). Here, the bytes “0xFF 0xFF” tell us that we have reached the beginning of the configuration. That is why we read two bytes at a time and, when they do not match, move the file pointer back by 1 before reading again. Then we read up to the “0x00” byte in the classic way, delete non-ASCII characters, and return the parsed registry key to our main function.
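The two-byte read with a one-byte step back can be sketched as follows. The function name parse_registry_key, the anchor argument (whatever byte sequence find() is asked to locate first), and the sample bytes are assumptions for illustration; the original patterns are not given in the text.

```python
import io


def parse_registry_key(fp, anchor):
    """Find `anchor`, scan forward to the 0xFF 0xFF marker, read to 0x00."""
    fp.seek(0)
    dump = fp.read()
    start = dump.find(anchor)
    if start == -1:
        return None
    fp.seek(start + len(anchor))
    # Read two bytes at a time; on a mismatch step back one byte so the
    # window slides forward by a single byte on each iteration.
    while True:
        pair = fp.read(2)
        if len(pair) < 2:
            return None  # reached end of dump without finding the marker
        if pair == b"\xff\xff":
            break
        fp.seek(-1, io.SEEK_CUR)
    key = bytearray()
    while True:
        ch = fp.read(1)
        if ch in (b"", b"\x00"):
            break
        key += ch
    # Delete non-ASCII bytes before converting to a string.
    return bytes(b for b in key if b < 0x80).decode("ascii")


# Made-up dump fragment; "ANCHOR" stands in for the real pattern.
sample = io.BytesIO(b"AAAAANCHOR\x01\x02\x03\xff\xffStartup key\x00zzz")
reg_key = parse_registry_key(sample, b"ANCHOR")
```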
Now all that remains is to determine the persistence type of the malware. After that, we will parse the folder name and the file name according to its type.
As you can see in the image above, if the malware contains the relevant pattern, it is Type 2; if not, it is Type 1. Now we will write the functions that parse both the created folder name and the name of the malware from the memory dump for Type 1 and Type 2.
As can be seen in the figures above, Type 1 malware has no secondary VBA script because it achieves persistence directly through the registry: the path of the malware is written straight into the registry. Type 2 malware, however, writes the path of the VBA script to the registry. The VBA script runs at system startup and in turn runs the malware with its payload.
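The type check itself reduces to a membership test, as in this minimal sketch. The function name detect_type and the marker bytes are hypothetical; the real Type 2 pattern is shown only in the image and is not reproduced here.

```python
def detect_type(dump, type2_pattern):
    """Return 2 if the Type 2 pattern occurs in the dump, otherwise 1."""
    return 2 if type2_pattern in dump else 1


# Placeholder marker, used only to exercise the function.
PLACEHOLDER_TYPE2_PATTERN = b"\xff\xffTYPE2_MARKER"

type_a = detect_type(b"\x00\x01\xff\xffTYPE2_MARKER\x00", PLACEHOLDER_TYPE2_PATTERN)
type_b = detect_type(b"\x00\x01no marker\x00", PLACEHOLDER_TYPE2_PATTERN)
```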
In Type 1 malware, the name of the executable and the name of the folder in which it is located are stored, in that order, in the bytes that follow the registry configurations in the memory dump.
In Type 2 malware, the payload, the name of the executable, and the name of the folder in which it is located are stored, in that order, in the bytes that follow the registry configurations in the memory dump.
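The two layouts can be handled by one reader that pulls a different number of fields depending on the type. The sketch below assumes the fields are consecutive NUL-terminated ASCII strings separated by a single 0x00 byte, which the text does not confirm; the function names, dictionary keys, and sample bytes are hypothetical.

```python
def read_cstrings(dump, offset, count):
    """Read `count` consecutive NUL-terminated strings starting at `offset`."""
    fields = []
    for _ in range(count):
        end = dump.find(b"\x00", offset)
        if end == -1:
            end = len(dump)
        fields.append(dump[offset:end].decode("ascii", errors="ignore"))
        offset = end + 1  # continue right after the terminator
    return fields


def parse_drop_info(dump, offset, malware_type):
    """Map the fields after the registry configuration to names by type."""
    if malware_type == 1:
        executable, folder = read_cstrings(dump, offset, 2)
        return {"executable": executable, "folder": folder}
    payload, executable, folder = read_cstrings(dump, offset, 3)
    return {"payload": payload, "executable": executable, "folder": folder}


# Made-up fragments; offset 0 stands in for "right after the registry data".
type1_info = parse_drop_info(b"sample.exe\x00DropDir\x00", 0, 1)
type2_info = parse_drop_info(b"run.vbs\x00sample.exe\x00DropDir\x00", 0, 2)
```

Sharing read_cstrings between both branches reflects the point that the two types differ only in how many fields follow the registry configuration.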
Since both types share the same patterns, we only need a few additional code snippets to extract the necessary configurations. In the image below, you can see the output of AIMA’s integrated Config Extractor module.
We look forward to your feedback. See you in our next extraction article.