Reverse Engineering

Objectives: By the end of this topic, you will be able to…

  • Identify key functions and strings within a binary
  • Use tools such as objdump, gdb, and Cutter to inspect machine code
  • Understand the basic execution flow without access to source code
  • Recognize limitations, risks, and ethical aspects of reverse engineering

What is reverse engineering and what is it used for?

Reverse engineering is the process of analyzing the internal workings of a program or system without access to its original source code, in order to understand its behavior, architecture, or potential weaknesses.

In cybersecurity, reverse engineering is used to analyze malware and determine its purpose, identify security vulnerabilities, and validate software integrity through auditing. It also plays a role in forensic investigations after incidents, in studying protection mechanisms such as anti-debugging and obfuscation, and in ensuring compatibility with legacy or proprietary software. It is a common skill in malware analysis, exploit development, and security auditing.


Differences between static and dynamic analysis

Static analysis

Static analysis is performed without running the program: the binary file is examined directly with a disassembler, which lets the analyst review assembly code, functions, strings, and data structures. It is safer than running unknown code, though raw assembly can be harder to interpret without context.

Common tools: strings, objdump, Ghidra, IDA Free
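
To make the idea concrete, here is a minimal sketch of what a tool like strings does: it scans raw bytes and keeps runs of printable ASCII characters of at least a minimum length. The function name extract_strings and the sample bytes are made up for this example; the default of 4 mirrors the default minimum length of GNU strings, and the real tool supports more encodings and options.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde, the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

# Made-up sample bytes standing in for the contents of a binary.
sample = b"\x7fELF\x02\x01\x00Guess The Pass\x00\x01\x02OK\x00ERROR\x00"

print(extract_strings(sample))      # default minimum length of 4
print(extract_strings(sample, 2))   # a lower threshold also catches "OK"
```

With the default threshold, short strings such as "OK" are skipped, which is worth remembering when a message you expect does not appear in the output.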

Dynamic analysis

Dynamic analysis executes the binary in a controlled environment — a sandbox or virtual machine — and observes its real behavior: system calls made, network connections initiated, files read or written. This is especially useful for detecting hidden behavior or dependencies that are only resolved at runtime.

Common tools: gdb, strace, ltrace, radare2, Cutter
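
As a rough illustration of "run it and observe", the sketch below launches a program, feeds it input, and records its output and exit code. The helper name observe is hypothetical, and the target here is a harmless one-line Python program used as a stand-in; an unknown binary should only ever be run this way inside a sandbox or virtual machine. Tools like strace and ltrace observe far more (system calls, library calls) than this sketch does.

```python
import subprocess
import sys

def observe(cmd, stdin_text="", timeout=5.0):
    """Run cmd, feed it stdin_text, and record what it prints and returns."""
    result = subprocess.run(
        cmd,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,  # give up if the program hangs
    )
    return {
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }

# Stand-in target: a one-line Python program instead of an unknown binary.
target = [sys.executable, "-c", "print('got: ' + input())"]
report = observe(target, stdin_text="hello\n")
print(report)
```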


Executable formats

ELF (Executable and Linkable Format)

ELF is the standard executable format on Linux systems. It organizes code and data into named sections such as .text (instructions), .data (initialized variables), .bss (uninitialized variables), and .plt and .got (used to resolve calls to shared-library functions at runtime). It may also include debugging symbols and exported function names.
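
The first bytes of an ELF file already reveal much of what the file command reports. The sketch below decodes a few header fields from the first 20 bytes; the function name parse_elf_ident and the small lookup tables are choices for this example and cover only a handful of common values. The header used here is synthetic, built by hand for the demonstration.

```python
import struct

ELF_CLASSES = {1: "32-bit", 2: "64-bit"}
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object / PIE", 4: "core"}
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}

def parse_elf_ident(header: bytes) -> dict:
    """Decode a few fields from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    endian = "<" if ei_data == 1 else ">"  # 1 = little-endian (LSB)
    # e_type and e_machine are two 16-bit fields starting at offset 16.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "class": ELF_CLASSES.get(ei_class, "unknown"),
        "endianness": "little-endian" if ei_data == 1 else "big-endian",
        "type": ELF_TYPES.get(e_type, "unknown"),
        "machine": ELF_MACHINES.get(e_machine, "unknown"),
    }

# Synthetic header: 64-bit, little-endian, type 3 (DYN), machine 62 (x86-64).
header = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
info = parse_elf_ident(header)
print(info)
```

To try it on a real file, read the first 20 bytes in binary mode (open(path, "rb").read(20)) and pass them to the function.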

PE (Portable Executable)

PE is the executable format on Windows (.exe, .dll). It follows a similar conceptual structure to ELF — with a DOS header, section table, and import/export tables — and is widely used by malware developers targeting Windows environments.

Key difference: ELF is native to Unix/Linux, PE is native to Windows. Analysis tools differ between platforms, though concepts like sections, imports, and headers are similar.
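
Telling the two formats apart is a matter of checking their signatures. The sketch below assumes only the well-known markers: ELF files start with the bytes 0x7f 'E' 'L' 'F', while PE files start with a DOS header ("MZ") whose field at offset 0x3C points to the "PE\0\0" signature. The function name detect_format and the hand-built sample buffers are illustrative.

```python
import struct

def detect_format(data: bytes) -> str:
    """Classify a file as ELF, PE, or unknown from its signature bytes."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:2] == b"MZ" and len(data) >= 0x40:
        # The DOS header stores the offset of the PE signature at 0x3C.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"

# Hand-built minimal buffers, just enough to carry the signatures.
fake_elf = b"\x7fELF" + bytes(60)
fake_pe = bytearray(0x80)
fake_pe[0:2] = b"MZ"
struct.pack_into("<I", fake_pe, 0x3C, 0x40)
fake_pe[0x40:0x44] = b"PE\x00\x00"

print(detect_format(fake_elf), detect_format(bytes(fake_pe)), detect_format(b"#!/bin/sh\n"))
```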


Disassemblers and debuggers

Disassemblers

Translate machine code into assembly language to understand what a binary does at a low level.

objdump -d binary
radare2 -A binary

Ghidra provides both disassembly and pseudocode decompilation.
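
At its core, a disassembler maps byte patterns to mnemonics. The toy below handles only six hand-picked x86-64 encodings and is in no way a real disassembler (real instruction decoding involves prefixes, ModR/M bytes, and variable-length operands); the table, the function name toy_disassemble, and the base address 0x1139 are all assumptions for illustration.

```python
# Toy opcode table: a tiny, hand-picked subset of x86-64 encodings.
OPCODES = {
    b"\x55": "push rbp",
    b"\x48\x89\xe5": "mov rbp, rsp",
    b"\x5d": "pop rbp",
    b"\x90": "nop",
    b"\xc9": "leave",
    b"\xc3": "ret",
}

def toy_disassemble(code: bytes, base: int = 0) -> list:
    """Return (address, text) pairs for the byte patterns the table knows."""
    lines = []
    offset = 0
    while offset < len(code):
        # Try longer encodings first so multi-byte instructions win.
        for encoding in sorted(OPCODES, key=len, reverse=True):
            if code.startswith(encoding, offset):
                lines.append((base + offset, OPCODES[encoding]))
                offset += len(encoding)
                break
        else:
            lines.append((base + offset, "(unknown byte 0x%02x)" % code[offset]))
            offset += 1
    return lines

listing = toy_disassemble(bytes.fromhex("554889e5905dc3"), base=0x1139)
for address, text in listing:
    print("%#x: %s" % (address, text))
```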

Debuggers

Allow running a binary step by step, using breakpoints to pause execution at critical points, and inspecting memory, registers, and call stack.

gdb binary

Basic debugging commands:

  • break <function|address> — set a breakpoint
  • run — start execution
  • next / step — step through execution
  • info registers / x / print — inspect state

Function analysis and conditional logic

Function identification

Functions can be located via symbol tables, prolog/epilog patterns, or cross-references. Tools like Ghidra detect them automatically.

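On 32-bit x86, a function compiled with frame pointers typically begins with a prolog that sets up the stack frame (push ebp; mov ebp, esp) and ends with an epilog that restores it and returns (pop ebp or leave, followed by ret). On x86-64 the same pattern uses rbp and rsp, and optimized builds often omit it altogether. Between prolog and epilog, call transfers control to a subroutine, jmp jumps unconditionally, and conditional jumps (je, jne) implement branching, usually right after a comparison (cmp or test).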
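
Prolog patterns can be searched for directly in the raw bytes, which is one of the heuristics tools use when no symbols are available. The sketch below looks for the classic frame-pointer prolog; the function name find_prologues and the sample bytes are made up, and the heuristic is deliberately naive: it can report false positives and it misses functions compiled without a frame pointer.

```python
PROLOGUES = {
    "x86-64": bytes.fromhex("554889e5"),  # push rbp; mov rbp, rsp
    "x86": bytes.fromhex("5589e5"),       # push ebp; mov ebp, esp
}

def find_prologues(code: bytes, arch: str = "x86-64") -> list:
    """Return every offset where the frame-pointer prolog pattern occurs."""
    pattern = PROLOGUES[arch]
    hits = []
    start = 0
    while (pos := code.find(pattern, start)) != -1:
        hits.append(pos)
        start = pos + 1
    return hits

# Two tiny made-up functions separated by two padding NOPs.
code = bytes.fromhex("554889e5" "5dc3" "9090" "554889e5" "c9c3")
print(find_prologues(code))
```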

Conditional analysis

Program decisions are represented as blocks with comparison instructions (cmp) and conditional jumps (je, jne, jg, jl). Understanding this logic helps identify critical branches: password validation, flow decisions, anti-analysis tricks.
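
The pairing of a comparison with a conditional jump can be modeled in a few lines. The sketch below treats cmp a, b followed by a jump as a single question, "is the jump taken?", for 32-bit signed operands; the names jump_taken and to_signed32 are invented for this example, and the CPU actually implements this through status flags rather than direct comparison. Compilers often emit test eax, eax after a call such as strcmp, which behaves like cmp eax, 0 as far as je and jne are concerned.

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

# Each conditional jump reads the outcome of the preceding `cmp a, b`.
CONDITIONS = {
    "je": lambda a, b: a == b,   # jump if equal (zero flag set)
    "jne": lambda a, b: a != b,  # jump if not equal
    "jg": lambda a, b: a > b,    # jump if greater (signed)
    "jl": lambda a, b: a < b,    # jump if less (signed)
}

def jump_taken(mnemonic: str, a: int, b: int) -> bool:
    """Model `cmp a, b` followed by the given conditional jump."""
    return CONDITIONS[mnemonic](to_signed32(a), to_signed32(b))

print(jump_taken("je", 0, 0))            # equal operands: je is taken
print(jump_taken("jne", 5, 0))           # different operands: jne is taken
print(jump_taken("jl", 0xFFFFFFFF, 1))   # 0xFFFFFFFF is -1 when signed
```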

Common techniques include following the execution flow with gdb or a graphical tool, focusing analysis on the code around main, printf, system, and network or system calls, and tracing how variables and buffers move on the stack.


Ethics and legality of binary analysis

Reverse engineering can have legitimate uses (auditing, research, learning), but it can also violate:

  • Copyright laws (restrictive licenses)
  • Terms of use (EULA)
  • Cybercrime laws if done without consent

In practice, this means only analyzing software from legitimate sources or binaries created for educational purposes, avoiding distribution of modified code without permission, and documenting all analysis for academic or security-improvement purposes. All technical practices must be backed by clear ethical principles and explicit authorization.


Hands-on lab

Requirements: Kali Linux, gdb, Ghidra or Cutter, and radare2 for the optional Part 5

In this lab you will analyze a binary that asks the user for a password. Your goal is to find the correct password without access to the source code, using only reverse engineering tools and techniques.

Part 0: Download the binary

  1. Download the crackme from crackmes.one. You will get a .zip file.

  2. Extract it. The zip file is password-protected; enter crackmes.one when unzip prompts for the password:

unzip crackme.zip
  3. Make the binary executable:
chmod +x passguess

You should now have the passguess binary ready to analyze.

Part 1: Initial reconnaissance

Before opening any advanced tool, gather basic information about the binary.

  1. Determine the file type. Run the following command and note whether the binary is 32-bit or 64-bit, statically or dynamically linked, and whether symbols have been stripped:
file passguess

Write down the architecture (e.g. ELF 64-bit LSB executable, x86-64) — you will need this to choose the correct analysis mode later.

  2. Extract readable strings. Many binaries contain plaintext strings (prompts, error messages, hardcoded values). Search for them:
strings passguess | less

Look for anything that resembles a user-facing message (e.g. "Guess The Pass", "OK", "ERROR"). Also look for any short, suspicious strings that could be a hardcoded password or key. Write down every string that seems relevant. Note that strings will also show internal compiler artifacts and memory alignment data — not every short string is meaningful. You will confirm which one is the actual password in later parts.

  3. Inspect the disassembly for key functions. Get a raw disassembly listing and search for well-known function names:
objdump -d passguess | less

Inside less, press / and type main to jump to the main function. Also search for calls to library functions like strcmp, printf, puts, and scanf. These tell you what the program does: reads input, compares it, and prints a result.

Checkpoint: At this point you should have a rough idea of what the program does — it reads a password from the user and checks it against something. You may have already spotted the answer in the strings output. The next parts will confirm your hypothesis.

Part 2: Decompilation with Ghidra

Ghidra can reconstruct approximate C source code (pseudocode) from a binary, which is much easier to read than raw assembly.

  1. Open Ghidra and create a new project (File → New Project → Non-Shared Project). Give it any name.

  2. Import the binary. Go to File → Import File and select passguess. Ghidra will auto-detect the format and architecture. Click OK.

  3. Run the auto-analysis. When prompted with “program has not been analyzed. Would you like to analyze it now?”, click Yes and accept the default analyzers. Wait for the analysis to complete (progress bar at the bottom right).

  4. Navigate to main. In the Symbol Tree panel on the left, expand Functions and click on main. The Listing panel will show the disassembly and the Decompile panel will show the pseudocode.

  5. Read the pseudocode carefully. You should see something similar to this structure:

    • A local buffer (array) is declared on the stack (e.g. local_118)
    • printf prints a prompt asking for input
    • scanf reads user input into the buffer
    • strcmp compares the buffer against a hardcoded string
    • An if/else prints either a success or failure message
    • You may also see a __stack_chk_fail() call near the end — this is an automatic stack canary check inserted by the compiler to detect buffer overflows. You can ignore it for this exercise.
  6. Rename variables for clarity. Ghidra generates placeholder names like local_118 or iVar1. You can rename them to make the pseudocode easier to read: right-click a variable name and select Rename Variable (or press L). For example:

    • Rename the buffer (e.g. local_118) to user_input
    • Rename the strcmp return value (e.g. iVar1) to password_match
  7. Identify the comparison. Look at the strcmp call. It takes two arguments:

    • The buffer containing user input
    • A string literal — the hardcoded password

The strcmp function returns 0 when both strings are equal. Depending on the decompiler, you may see the check written as iVar1 == 0 (Ghidra style) or !strcmp(...) (IDA style) — both mean the same thing: the success branch executes when the strings match.

  8. Extract the password. The second argument to strcmp is the correct password. Write it down.

  9. Verify your finding. Run the binary and enter the password you found:

./passguess

You should see the success message. If you do, you have successfully reverse engineered the binary.
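
For comparison, the logic that the pseudocode describes can be restated as a short model. This is a sketch of the general structure only: the names check and strcmp are stand-ins written for this example, and HARDCODED is a deliberately fake placeholder, not the password in the binary. The strcmp model only reproduces the sign convention for plain ASCII text.

```python
HARDCODED = "placeholder"  # hypothetical value, NOT the real password

def strcmp(a: str, b: str) -> int:
    """Rough model of C strcmp: 0 if equal, otherwise negative or positive."""
    if a == b:
        return 0
    return -1 if a < b else 1

def check(user_input: str) -> str:
    """Mirror the decompiled structure: compare, then branch on the result."""
    password_match = strcmp(user_input, HARDCODED)
    if password_match == 0:  # 0 means the strings are equal
        return "OK"
    return "ERROR"

print(check("hello"))
print(check("placeholder"))
```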

Part 3: Dynamic analysis with gdb

Even though you already know the password, use gdb to confirm the finding dynamically and practice debugging skills.

  1. Start the debugger:
gdb ./passguess
  2. Break at main and run:
(gdb) break main
(gdb) run
  3. Disassemble main to find the strcmp call. Since the binary has no debug symbols, we cannot step line-by-line with next. Instead, we will find the exact address of the strcmp call and set a precise breakpoint:
(gdb) disassemble main

Scroll through the output and look for a line containing call and strcmp (it may appear as strcmp@plt). Note the address on the left side of that line (e.g. 0x00005555555551a8).

  4. Set a breakpoint at the strcmp call address and continue. Replace the address below with the one you found. This ensures you only break on the password comparison, not on internal library calls to strcmp:
(gdb) break *0x00005555555551a8
(gdb) continue

The program will print the prompt and wait for input. Type a test password (e.g. hello) and press Enter. Execution will pause right before the strcmp call that checks your password.

  5. Inspect the arguments. On x86-64, function arguments are passed in registers: rdi holds the first argument and rsi holds the second. Examine them as strings:
(gdb) x/s $rdi
(gdb) x/s $rsi

One of these will show the password you typed, and the other will show the hardcoded password from the binary. This confirms what you found in Ghidra.

  6. Inspect the return value. Step into strcmp and then use finish to let it complete. The return value will be in rax (strcmp returns a C int, so the meaningful part is the low 32 bits, eax):
(gdb) stepi
(gdb) finish
(gdb) info registers rax

If rax is 0, the strings matched. Any other value means they differ.
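
Two conventions used in this part can be written down explicitly: which register carries which argument, and how to read an int return value out of rax. The sketch below assumes the System V AMD64 calling convention used on 64-bit Linux; the helper names register_for_argument and int_return_value are invented for this example.

```python
# System V AMD64 order for the first six integer/pointer arguments.
ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def register_for_argument(index: int) -> str:
    """Return the register holding the argument at a zero-based index."""
    if index < 0 or index >= len(ARG_REGISTERS):
        raise IndexError("further arguments are passed on the stack")
    return ARG_REGISTERS[index]

def int_return_value(rax: int) -> int:
    """Interpret the low 32 bits of rax (eax) as a signed C int."""
    eax = rax & 0xFFFFFFFF
    return eax - (1 << 32) if eax & 0x80000000 else eax

print(register_for_argument(0), register_for_argument(1))  # strcmp's two arguments
print(int_return_value(0))           # 0: the strings matched
print(int_return_value(0xFFFFFFF8))  # a negative result shown by gdb as a large number
```

The second helper explains why gdb may display a large positive number in rax when strcmp returned a negative value.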

Part 4: Visual flow analysis with Cutter

Cutter (or the Ghidra graph view) provides a visual representation of the program’s control flow, making it easy to see decision points.

  1. Open the binary in Cutter (or use Ghidra’s Function Graph view: Window → Function Graph).

  2. Navigate to main using the functions list or the search bar.

  3. Switch to the graph view. You should see a flowchart with blocks connected by arrows:

    • A block that calls scanf to read input
    • A block that calls strcmp to compare strings
    • A conditional branch that splits into two paths:
      • Success path → prints the “OK” message
      • Failure path → prints the “ERROR” message
  4. Take a screenshot of the graph. Annotate it to show:

    • Where user input is read
    • Where the comparison happens
    • Which branch leads to success and which leads to failure

The graph makes it visually obvious that there is exactly one condition that determines the outcome — matching the hardcoded password.
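
The same observation can be checked on a small model of the graph. The block names below are hypothetical labels for what the graph view shows, not names taken from the binary, and the path search assumes a graph without loops.

```python
# Hypothetical basic blocks of main and their successors.
CFG = {
    "read_input": ["compare"],
    "compare": ["success", "failure"],  # the only block with two successors
    "success": ["exit"],
    "failure": ["exit"],
    "exit": [],
}

def all_paths(graph, node, goal, prefix=()):
    """List every path from node to goal (the graph must be acyclic)."""
    path = prefix + (node,)
    if node == goal:
        return [path]
    paths = []
    for successor in graph[node]:
        paths.extend(all_paths(graph, successor, goal, path))
    return paths

def decision_points(graph):
    """Blocks with more than one successor are where the program decides."""
    return [node for node, successors in graph.items() if len(successors) > 1]

paths = all_paths(CFG, "read_input", "exit")
for path in paths:
    print(" -> ".join(path))
print("decision points:", decision_points(CFG))
```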

Part 5 (Optional): Patching the binary with radare2

Instead of finding the password, you can modify the binary so that any password is accepted.

  1. Open the binary in write mode:
cp passguess passguess_patched
radare2 -w ./passguess_patched
  2. Analyze and navigate to main:
[0x00...]> aaa
[0x00...]> s main
[0x00...]> pdf
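  3. Find the conditional jump. Look for a je (jump if equal) or jne (jump if not equal) instruction near the strcmp call. This is the instruction that decides which branch to take. Note both its address and the target address it jumps to.

  4. Patch the instruction. There are two common approaches:

    Option A: Flip the condition. If the instruction says je, change it to jne, or vice versa, keeping the same jump target. This inverts the logic so wrong passwords are accepted and the correct one is rejected:

    [0x00...]> s <address of the jump instruction>
    [0x00...]> wa jne <original jump target>

    Option B: NOP the jump. Overwrite every byte of the conditional jump with NOP (no-operation) bytes. A short je or jne is two bytes long, so a single one-byte nop would leave the second byte behind as a stray instruction. The command below writes two NOP bytes (0x90 each); a longer jump encoding needs one 0x90 per byte:

    [0x00...]> s <address of the jump instruction>
    [0x00...]> wx 9090

    This removes the branch, so execution always falls through to the code that follows the jump. That accepts any password only when the fall-through code is the success path; if the jump itself leads to the success path, replace it with an unconditional jmp to the same target instead.

    Both techniques bypass the password check. Option A inverts the logic; Option B eliminates the decision entirely.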

  5. Save and exit. In write mode (-w), radare2 writes each patch to the file as you make it, so quitting is enough:

[0x00...]> quit
  6. Test the patched binary. Run it and enter any random password:
./passguess_patched

The program should now accept any input as correct (or reject the real password). This demonstrates how a single byte change can completely alter program behavior.
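
At the byte level, both kinds of patch are very small. The sketch below applies them to a made-up four-byte snippet (test eax, eax followed by a short jne) held in memory; it assumes the short forms of the jumps, where je is opcode 0x74 and jne is 0x75, each followed by a one-byte displacement. The function names are invented for this example, and the near forms (0x0f 0x84 / 0x0f 0x85 with a four-byte displacement) are not handled.

```python
JE_SHORT, JNE_SHORT, NOP = 0x74, 0x75, 0x90

def flip_short_jump(code: bytearray, offset: int) -> None:
    """Turn je into jne (or the reverse) by changing one opcode byte."""
    if code[offset] == JE_SHORT:
        code[offset] = JNE_SHORT
    elif code[offset] == JNE_SHORT:
        code[offset] = JE_SHORT
    else:
        raise ValueError("not a short je/jne at this offset")

def nop_short_jump(code: bytearray, offset: int) -> None:
    """Overwrite both bytes of a short je/jne (opcode and displacement)."""
    if code[offset] not in (JE_SHORT, JNE_SHORT):
        raise ValueError("not a short je/jne at this offset")
    code[offset:offset + 2] = bytes([NOP, NOP])

def short_jump_target(address: int, displacement_byte: int) -> int:
    """Compute where a two-byte short jump located at address lands."""
    # The displacement is a signed 8-bit value relative to the next instruction.
    disp = displacement_byte - 256 if displacement_byte >= 0x80 else displacement_byte
    return address + 2 + disp

original = bytearray.fromhex("85c0750e")  # test eax, eax; jne +0x0e (made-up bytes)

flipped = bytearray(original)
flip_short_jump(flipped, 2)
nopped = bytearray(original)
nop_short_jump(nopped, 2)

print(original.hex(), "->", flipped.hex(), "(condition flipped)")
print(original.hex(), "->", nopped.hex(), "(jump removed)")
print(hex(short_jump_target(0x11a2, 0x0e)))
```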

Submission

Write a short report (PDF or Markdown) containing:

  1. Reconnaissance results: output of file and strings, with the relevant strings highlighted
  2. Pseudocode analysis: a screenshot of the Ghidra decompiler output for main, with annotations explaining each line
  3. The password you found and a screenshot of the program accepting it
  4. gdb session: show the output of x/s $rdi and x/s $rsi at the strcmp breakpoint, confirming the hardcoded password
  5. Flow graph: an annotated screenshot from Cutter or Ghidra showing the two branches
  6. (Optional) If you completed Part 5, show the patched instruction and demonstrate the modified behavior
  7. Reflection: In your own words, explain why hardcoding passwords in a binary is insecure, and suggest a more secure alternative

Key concepts

Term | Definition
Static analysis | Examination of code or binaries without executing them
Dynamic analysis | Analysis of behavior during execution
ELF | Standard executable format on Linux systems
PE | Executable format on Windows systems
GDB | Essential command-line debugger for binary analysis
