Malware Analysis

Objectives: By the end of this topic, you will be able to…

  • Differentiate between static and dynamic analysis
  • Observe and document suspicious behaviors of a binary
  • Recognize indicators of compromise
  • Perform basic analysis in a safe and controlled manner

What is malware?

Malware (malicious software) is any program designed to damage, disrupt, steal, or exploit computer systems, networks, or data.

Common types

| Type | Description |
|---|---|
| Trojan | Disguises itself as legitimate software to execute malicious actions |
| Worm | Replicates automatically through networks without user intervention |
| Ransomware | Encrypts files and demands payment for their release |
| Spyware | Monitors user activity without consent (e.g., keyloggers) |
| Rootkit | Gains privileged access while hiding its presence |
| Botnet | Network of infected computers remotely controlled for distributed attacks |

Malware life cycle

Understanding the life cycle helps detect and mitigate attacks at different stages:

  1. Delivery: How malware reaches the system (email attachments, malicious links, USB, exploits)
  2. Execution: The malicious code runs on the victim system
  3. Persistence: Attempts to maintain presence after reboots or cleanup
  4. Command and Control (C2): Communication with the attacker for instructions
  5. Action: Data theft, file encryption, espionage
  6. Evasion: Techniques to avoid detection (obfuscation, encryption, sandbox detection)

Static analysis (without executing the binary)

Focuses on examining the malicious file without running it, making it safer in initial stages.

Common techniques start with reviewing metadata (creation date, author, hashes) and identifying the file type with file, binwalk, or readelf. A disassembler like Ghidra, IDA Free, or Radare2 reveals the code structure, while strings surfaces readable text (URLs, command strings, filenames) that may expose intent without execution. Inspecting the binary headers uncovers imports, exports, sections, and target architecture. Finally, comparing file hashes against databases like VirusTotal or Hybrid Analysis can quickly confirm whether this exact file has been seen before. A miss proves little, because changing a single byte of the sample produces a different hash.

Static analysis carries low risk of infection and is useful for gathering initial indicators, but it does not reveal dynamic behavior and can be defeated by obfuscated or packed malware.
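The first static steps (hashing the sample and pulling out printable strings) can be sketched in Python. This is a minimal illustration, not a replacement for sha256sum or strings. The function names are hypothetical, the 4-character minimum is chosen to mirror the default of strings, and the sample bytes are made up.

```python
import hashlib
import re


def file_hashes(data: bytes) -> dict:
    """Return the MD5 and SHA-256 hex digests of a sample's raw bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


if __name__ == "__main__":
    # Hypothetical bytes standing in for a file's contents. A real sample
    # would be read with open(path, "rb").read() and never executed.
    blob = b"\x7fELF\x02\x01\x00\x00connect to 10.0.0.5\x00\x01\x02/tmp/x.log\x00"
    print(file_hashes(blob))
    print(printable_strings(blob))
```

Reading the file as bytes keeps this step purely static: nothing in the sample is ever loaded or run.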


Dynamic analysis (controlled execution)

Involves executing the malware in a safe environment to observe its behavior.

Typical environment: isolated virtual machines with snapshots and no direct internet connection.

The analyst watches for file, registry, or process modifications; outbound communications (IP address, domain, port, and protocol); persistence mechanisms such as scheduled tasks or startup file modifications; and unusual process behavior or resource usage that would be invisible to static inspection.

A few precautions are non-negotiable: never run real samples on production or daily-use systems, keep the VM network-isolated so any C2 traffic cannot reach the real internet, and take a snapshot before executing anything so the environment can be cleanly reverted.


Indicators of Compromise (IoC)

IoCs are traces or signals that indicate a system has been compromised. They include file hashes (MD5, SHA-256) of known malicious samples, suspicious filenames or filesystem paths, IP addresses and domains the malware communicates with, modified registry keys, and characteristic strings found inside binaries or running processes. These indicators are shared among security professionals through formats like STIX and platforms like MISP, enabling faster detection and coordinated incident response across organizations.
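How shared indicators are used for detection can be sketched as a simple lookup. Everything here is illustrative: the indicator values are placeholders (a documentation-range IP address, an example domain, an all-zero hash), and match_iocs is a hypothetical helper. Real feeds exchange indicators as STIX objects or MISP events rather than Python sets.

```python
# Known-bad indicators, as they might arrive from a shared feed.
# All values are placeholders for illustration only.
KNOWN_IOCS = {
    "sha256": {"0" * 64},
    "ip": {"203.0.113.10"},
    "domain": {"bad.example"},
    "path": {"/tmp/dropper.bin"},
}


def match_iocs(observed: dict, known: dict = KNOWN_IOCS) -> list:
    """Return sorted (type, value) pairs observed on a host that match a known indicator."""
    hits = []
    for ioc_type, values in observed.items():
        for value in values:
            if value in known.get(ioc_type, set()):
                hits.append((ioc_type, value))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical observations collected from one host.
    observed = {
        "ip": ["198.51.100.7", "203.0.113.10"],
        "path": ["/tmp/dropper.bin", "/home/user/notes.txt"],
    }
    for ioc_type, value in match_iocs(observed):
        print("match:", ioc_type, value)
```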


Recommendations for safe analysis

All analysis must happen inside a controlled environment (a dedicated virtual machine or sandbox) with shared folders disabled, unnecessary ports closed, and no direct internet connection, so that any outbound traffic from the sample stays contained. Take a snapshot before executing anything so the environment can be reverted cleanly afterward. Keep forensic and monitoring tools running throughout the session and maintain a logbook of every action taken, both for reproducibility and to avoid confusing your own activity with the malware’s. Finally, store samples under neutral names with non-executable extensions (.txt, .bin) and, on Linux, without the execute permission bit, since the permission bit rather than the extension decides whether a file can be run directly.
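The storage advice can be sketched as a small helper that copies a sample under a neutral, hash-based name and leaves it readable only by its owner. The name quarantine_sample, the .bin suffix, and the directory layout are assumptions made for illustration, not an established tool.

```python
import hashlib
import os
import shutil
import stat


def quarantine_sample(src_path: str, dest_dir: str) -> str:
    """Copy a sample to dest_dir as <sha256>.bin, owner read-only and not executable."""
    with open(src_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, digest + ".bin")
    if os.path.exists(dest_path):
        return dest_path                  # already stored under this hash
    shutil.copyfile(src_path, dest_path)  # copies contents only, not mode bits
    os.chmod(dest_path, stat.S_IRUSR)     # 0o400: no write, no execute
    return dest_path
```

Naming the copy after its SHA-256 also ties the stored file directly to the hash recorded in your logbook.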


Hands-on lab

Requirements: Kali Linux (isolated VM), provided sim_malware binary (compiled with gcc -o sim_malware sim_malware.c -s), Wireshark

Part 0: Set up a safe sandbox

Before touching any sample, configure your VM for isolation. The steps differ slightly depending on your hypervisor.

VirtualBox

  1. Isolate the network. Open Settings → Network for your Kali VM. Set Adapter 1 to Host-only Adapter (or Internal Network if you do not need host communication at all). This prevents any traffic from reaching the real internet while still allowing loopback and host-guest communication you may need for file transfers.

  2. Disable shared folders. Go to Settings → Shared Folders and remove any active shares. A shared folder is a direct path for malware to escape the VM onto your host filesystem.

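  3. Disable drag-and-drop and clipboard sharing. Go to Settings → General → Advanced and set both Shared Clipboard and Drag’n’Drop to Disabled.

  4. Take a clean snapshot. With the VM running, in a known-good state, and with the settings above applied, go to Machine → Take Snapshot. Name it something clear like before-malware-lab. You will restore to this point after the exercise. Take the snapshot last: a snapshot also records the VM settings, so restoring one taken earlier would bring back the previous clipboard and drag-and-drop configuration.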

VMware (Workstation / Fusion)

  1. Isolate the network. Open VM → Settings → Network Adapter. Select Host-only. This confines traffic to a virtual network that has no route to the real internet. If you need finer control, use Custom (VMnet2 or higher) and verify in the Virtual Network Editor that DHCP is enabled and NAT is disabled for that VMnet.

  2. Disable shared folders. Go to VM → Settings → Options → Shared Folders and set the folder sharing option to Disabled.

  3. Disable drag-and-drop and clipboard sharing. Go to VM → Settings → Options → Guest Isolation and uncheck both Enable drag and drop and Enable copy and paste.

  4. Take a clean snapshot. With the settings above applied, go to VM → Snapshot → Take Snapshot. Name it before-malware-lab. After the session, use Snapshot → Revert to Snapshot to restore the clean state. Take the snapshot last, because it also records the VM settings in effect at that moment.

Verify isolation before proceeding

After configuring the above settings, confirm the VM cannot reach the internet:

ping -c 3 8.8.8.8
curl --max-time 5 https://example.com

Both commands should time out or fail. Only proceed to Part 1 once the network is confirmed isolated.
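The same check can be scripted so it is easy to repeat before every session. The sketch below is an assumption-laden alternative to ping and curl: it probes over TCP only (ICMP needs raw sockets), and the helper name can_reach and the two probe targets are chosen for illustration.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, unreachable, timeout, and DNS failure
        return False


if __name__ == "__main__":
    # Assumed probe targets: a public DNS resolver and a public web host.
    # On a correctly isolated VM every probe should print "blocked".
    for host, port in [("8.8.8.8", 53), ("example.com", 443)]:
        print(host, port, "REACHABLE" if can_reach(host, port) else "blocked")
```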


Part 1: Static analysis

  1. Run initial analysis:
file sim_malware
md5sum sim_malware
sha256sum sim_malware

Notice that file reports the binary as stripped — symbol names have been removed. This is standard practice in real malware to hinder analysis.

  2. Attempt to extract readable strings:
strings sim_malware | grep -E "127\.|/tmp|\.txt"

No results — the strings are not stored in plain text. This is why strings alone is unreliable for analyzing real-world samples.

  3. Inspect the disassembly and look for a decode loop:
objdump -d sim_malware | less

Because symbols are stripped, function names will not appear. You will also see several xor instructions early on that look like xor %ebp,%ebp or xor %ecx,%ecx — these are just the standard x86 idiom for zeroing a register and are not the loop you’re looking for.

Scroll further and look for a short, repeating loop that uses xor with a fixed numeric constant. It will look roughly like this:

mov    -0x20(%rbp),%rdx      ; load loop index (i)
mov    -0x8(%rbp),%rax       ; load pointer to encoded array
add    %rdx,%rax             ; rax = enc + i
movzbl (%rax),%eax           ; load one encoded byte
xor    $0x5a,%eax            ; decode: byte XOR key  ← the key is the constant here
mov    %eax,%ecx             ; save decoded byte
mov    -0x18(%rbp),%rdx      ; load pointer to output buffer
mov    -0x20(%rbp),%rax      ; reload loop index (i)
add    %rdx,%rax             ; rax = out + i
mov    %ecx,%edx             ; copy decoded byte into edx
mov    %dl,(%rax)            ; store decoded byte at out[i]

What is XOR obfuscation? XOR is a bitwise operation where each bit is flipped if — and only if — the corresponding bit in a key is 1. Applying the same key twice cancels out, so byte XOR key XOR key = byte. Malware authors exploit this to encode sensitive strings (C2 addresses, filenames, commands) as meaningless byte arrays at rest. The binary decodes them at runtime — just in time to use them — keeping them invisible to static tools like strings.

The fixed value after the xor instruction (0x5a in the example above) is the key. Every byte in the encoded array was scrambled with that same value before the binary was compiled.
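The round trip is easy to reproduce. The sketch below uses the 0x5a key from the example listing and a made-up path (not a string taken from sim_malware) to show that one function both encodes and decodes. The guess_keys helper is a hypothetical cross-check: it tries all 256 single-byte keys and keeps those that reveal an expected fragment.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; the same call encodes and decodes."""
    return bytes(b ^ key for b in data)


def guess_keys(encoded: bytes, marker: bytes) -> list:
    """Try all 256 single-byte keys and return those whose output contains marker."""
    return [k for k in range(256) if marker in xor_bytes(encoded, k)]


if __name__ == "__main__":
    key = 0x5A                       # key from the example listing
    secret = b"/tmp/example.txt"     # hypothetical string for illustration
    encoded = xor_bytes(secret, key)

    print(encoded)                   # the path is no longer readable
    print(xor_bytes(encoded, key))   # applying the key again restores it
    print([hex(k) for k in guess_keys(encoded, b"/tmp")])
```

Brute force only works this easily because the key is a single byte. It is a useful way to confirm a key you read from the disassembly, not a substitute for finding the decode loop.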

? Based on the disassembly, can you identify the XOR key used to encode the strings? What instructions in the loop reveal it?

Part 2: Dynamic analysis

Dynamic analysis bypasses obfuscation because the binary must decode its own strings at runtime in order to function.

  1. Open a listener in a separate terminal to receive the C2 beacon:
nc -lvnp 4444

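Start this before executing the binary and leave it running. Started this way, the listener exits once a connection closes, so restart it before each later run of the binary.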

  2. In your original terminal, trace system and library calls during execution:
strace -o output_strace.txt ./sim_malware
ltrace -o output_ltrace.txt ./sim_malware
  3. Examine the traces for decoded strings and key events:
grep -E "openat|connect|execve|sendto" output_strace.txt
grep -E "fopen|system|connect|send" output_ltrace.txt

The IP address, file path, and outbound message now appear in plain text even though they were hidden from strings.
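Pulling those values out of the trace can also be scripted. The sketch below assumes the usual strace rendering of IPv4 connect() and openat() calls. The two sample lines are illustrative stand-ins, not the real contents of output_strace.txt, and summarize_strace is a hypothetical helper.

```python
import re

# Matches e.g. connect(3, {sa_family=AF_INET, sin_port=htons(4444), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
CONNECT_RE = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([\d.]+)"\)\}'
)
# Matches e.g. openat(AT_FDCWD, "/some/path", O_RDONLY) = 3
OPENAT_RE = re.compile(r'openat\(AT_FDCWD, "([^"]+)"')


def summarize_strace(text: str) -> dict:
    """Collect IPv4 connect() targets and openat() paths from strace output."""
    endpoints = [(ip, int(port)) for port, ip in CONNECT_RE.findall(text)]
    return {"endpoints": endpoints, "paths": OPENAT_RE.findall(text)}


if __name__ == "__main__":
    # Illustrative lines only; read the real file with open("output_strace.txt").read().
    sample = '''openat(AT_FDCWD, "/tmp/example.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
connect(4, {sa_family=AF_INET, sin_port=htons(4444), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
'''
    print(summarize_strace(sample))
```

The extracted endpoints and paths map directly onto the Network and File rows of an IoC table.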

  4. Switch to the netcat terminal and check what the listener received. The message sent by the binary should have arrived.

  5. Open Wireshark, select the loopback (lo) interface, and apply the filter:

tcp.port == 4444

With the capture running, re-run the binary with the listener active and inspect the full TCP exchange. Locate the segment with the PSH flag set (Wireshark lists it as [PSH, ACK]); this is the one carrying the payload sent by the binary.

? What message did the binary send to the listener? How does strace reveal strings that strings could not find? What does this tell you about the limits of static analysis?

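  6. Monitor processes and open files. Run these from a second terminal while the binary is still active; if the process has already exited, pgrep returns nothing and lsof -p fails: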
netstat -anp
lsof -p $(pgrep sim_malware)
ps aux | grep sim_malware
  7. Check for dropped files:
find /tmp -newerct "10 minutes ago"
cat /tmp/payload.txt
  8. Check for persistence:
tail -5 ~/.bashrc

The binary appended a marker to your shell startup file. This simulates one of the most common Linux persistence techniques — modifying shell init files so the malware survives reboots and new terminal sessions.
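If you saved a copy of the startup file before running the sample, the change can be isolated exactly instead of eyeballing the last few lines. The sketch below assumes such a baseline exists (the filename ~/.bashrc.before is hypothetical) and that the file was only appended to. appended_lines is an illustrative helper.

```python
def appended_lines(before: str, after: str) -> list:
    """Return the lines added to the end of a file, or raise if it changed elsewhere."""
    old, new = before.splitlines(), after.splitlines()
    if new[:len(old)] != old:
        raise ValueError("file changed in more than an append; compare it manually")
    return new[len(old):]


if __name__ == "__main__":
    # Stand-in contents; in the lab these would come from the baseline copy
    # (hypothetical ~/.bashrc.before) and the current ~/.bashrc.
    before = "alias ll='ls -l'\nexport EDITOR=vim\n"
    after = before + "# example marker line\n"
    print(appended_lines(before, after))
```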

? Why is ~/.bashrc a useful persistence target for malware? What other files or mechanisms could an attacker use to achieve persistence on a Linux system?

Part 3: Indicators of Compromise (IoC)

Document every indicator you observed during Parts 1 and 2. Your table must include at least one entry per type:

| Type | Indicator | Observation |
|---|---|---|
| Hash | | |
| File | | |
| Persistence | | |
| Process | | |
| Network | | |
| Payload | | |
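If you plan to submit the table as .csv, a few lines of Python will produce it with the same three columns. This is only a sketch: iocs_to_csv is a hypothetical helper and the row shown is a placeholder to be replaced with your own observations.

```python
import csv
import io

FIELDS = ["Type", "Indicator", "Observation"]


def iocs_to_csv(rows: list) -> str:
    """Render IoC rows (dicts keyed by FIELDS) as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder row; fill in what you actually observed in Parts 1 and 2.
    rows = [
        {"Type": "Network", "Indicator": "<ip>:<port>", "Observation": "seen in strace output"},
    ]
    print(iocs_to_csv(rows), end="")
```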

? Which of the IoCs you documented would be the most reliable indicator for detecting this malware on a different system? Which would be easiest for a malware author to change to evade detection?

Submission

Compressed folder with:

  • output_strace.txt, output_ltrace.txt, captures from strings, objdump
  • Screenshots of netstat, ps, Wireshark traffic, and the netcat listener output
  • Completed IoC table in .md, .csv, or .pdf
  • Brief written reflection (less than 1 page)

Key concepts

| Term | Definition |
|---|---|
| Malware | Malicious software designed to damage or compromise systems |
| Ransomware | Malware that encrypts files and demands payment to release them |
| Trojan | Malware disguised as legitimate software |
| IoC | Observable evidence that a system has been compromised |
| Static analysis | Examination of binaries without executing them |
| Dynamic analysis | Observation of behavior during execution |
