Digital Forensics

Objectives: By the end of this topic, you will be able to…

  • Properly preserve digital evidence
  • Use forensic tools in Kali to analyze disks or files
  • Interpret relevant information from a forensic image
  • Document findings clearly and in a structured manner

Introduction to digital forensic analysis

Digital forensics is a cybersecurity discipline that identifies, preserves, analyzes, and presents digital evidence with legal validity. It is applied in cases of security incidents, fraud, intrusions, or legal investigations.

The goal of forensic analysis is not only to understand what happened, but also to preserve evidence for potential judicial use.

Scenario: A university discovers that login times for a departmental server look unusual — SSH sessions are opening at 3 a.m. from an internal IP that belongs to a student who graduated last semester. The IT team suspects an unauthorized account is being used, but they must determine what happened without altering any potential evidence. This is a digital forensics investigation.


Evidence preservation

Before analyzing any compromised system, it is essential to preserve the integrity of the evidence. For storage media, the first step is to create a forensic image (a bit-for-bit copy of the disk or device) using tools like dd, dcfldd, or Guymager, so that all analysis is performed on the copy rather than the original. A cryptographic hash is computed before and after acquisition to verify that no alteration occurred during the process; SHA-256 is preferred, because MD5 and SHA-1 are no longer collision-resistant. The original media is connected through a write blocker or mounted read-only to prevent any accidental modification. Every action taken from this point forward is recorded in a chain of custody log, documenting who handled the evidence, when, and what was done, so that the findings remain admissible in a legal proceeding.
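The preservation steps above can be sketched in shell. This is a minimal illustration, assuming GNU coreutils (sha256sum, dd) and a source that is a regular file or a device attached read-only; the function name preserve_evidence and the tab-separated log layout are hypothetical, not a standard.

```shell
# Hypothetical helper: image SRC to DST, verify by hash, append a custody log entry.
# Usage: preserve_evidence SRC DST LOGFILE EXAMINER
preserve_evidence() {
    src=$1; dst=$2; log=$3; examiner=$4

    # Hash the source before acquisition.
    before=$(sha256sum "$src" | cut -d' ' -f1)

    # Bit-for-bit copy of the source.
    dd if="$src" of="$dst" bs=1M status=none || return 1

    # Hash the copy, then the source again, to show neither changed.
    copy=$(sha256sum "$dst" | cut -d' ' -f1)
    after=$(sha256sum "$src" | cut -d' ' -f1)

    if [ "$before" = "$copy" ] && [ "$before" = "$after" ]; then
        result=VERIFIED
    else
        result=MISMATCH
    fi

    # One log line per action: UTC time, examiner, action, hash, outcome.
    printf '%s\t%s\timaged %s -> %s\tsha256=%s\t%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$examiner" "$src" "$dst" "$before" "$result" >> "$log"

    [ "$result" = VERIFIED ]
}
```

The function returns a non-zero status on any mismatch, so a calling script can stop before analysis begins on an unverified copy.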


Types of digital evidence

Source                     Examples
Files                      Deleted documents, modified or malicious files
System logs                Event logs (Windows Event Logs, syslog)
Metadata                   Creation/modification dates, author, location
Network/application logs   Access records, errors, connections
User artifacts             Browsing history, recent files, executed commands
RAM                        Active processes, passwords, connections

Forensic image analysis

Analysis is performed on an exact copy (image) of the original device. A full disk or partition image enables filesystem analysis (NTFS, ext4, FAT), recovery of deleted files, and searches for suspicious patterns without touching the original. A RAM dump captures volatile data: running processes, encryption keys, active connections, and other artifacts that disappear the moment the system powers off. Removable device images (USB drives, SD cards) are examined when there is reason to believe data was exfiltrated or malware was introduced through portable media.

How deleted file recovery works

On most filesystems, deleting a file removes the directory entry and marks the inode as unallocated, but the data blocks are not immediately overwritten. As long as those blocks have not been claimed by a new file, their content is still present on disk and can be recovered.

The behavior differs by filesystem. On ext2, the inode’s block pointers survive deletion, so a forensic tool can locate the exact blocks and reconstruct the file directly. FAT has no inodes, but the deleted directory entry keeps the starting cluster and file size, which is usually enough to rebuild an unfragmented file. On ext4 with journaling (the default), the kernel clears the inode’s block mapping as part of the deletion transaction, which makes inode-based recovery unreliable: the data blocks may still exist, but there is no pointer to them. Recovery then falls back to file carving: scanning raw disk space for known file headers and footers (JPEG, PDF, ZIP). Plain text files and shell scripts have no magic bytes, so signature-based carving cannot recover them.
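Header/footer carving can be sketched with standard tools. The example below is a simplified illustration, assuming GNU grep and a PDF stored contiguously; the function name carve_first_pdf is hypothetical. Dedicated carvers such as foremost also handle size limits, multiple hits, and files with more than one footer, which this sketch does not.

```shell
# Hypothetical helper: carve the first PDF found in a raw image.
# Usage: carve_first_pdf IMAGE OUTFILE
carve_first_pdf() {
    img=$1; out=$2

    # Byte offset of the first "%PDF-" header (grep -b -o prints "offset:match").
    start=$(LC_ALL=C grep -a -b -o -F '%PDF-' "$img" | head -n 1 | cut -d: -f1)
    [ -n "$start" ] || return 1

    # Byte offset of the first "%%EOF" footer located after that header.
    end=$(LC_ALL=C grep -a -b -o -F '%%EOF' "$img" | cut -d: -f1 |
          awk -v s="$start" '$1 > s { print $1; exit }')
    [ -n "$end" ] || return 1

    # Copy everything from the header through the 5-byte footer.
    len=$((end + 5 - start))
    tail -c +"$((start + 1))" "$img" | head -c "$len" > "$out"
}
```

Because the search relies entirely on the two signatures, the same approach finds nothing for a deleted shell script or text file.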

Timestamps and timeline reconstruction

Every file on a Linux filesystem carries up to four timestamps. The first three are the classic MAC times; together with the creation (birth) time, the set is often written MACB:

Timestamp          Meaning
mtime (Modified)   Last time the file’s content changed
atime (Accessed)   Last time the file was read
ctime (Changed)    Last time the inode metadata changed (permissions, ownership, links)
crtime (Created)   File creation time (stored by ext4; not available on ext2/ext3)

Reconstructing the order of events from these timestamps is called timeline analysis and is one of the primary techniques for establishing when actions took place. Inconsistencies — such as a file whose crtime is later than its mtime — can indicate tampering.
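A basic timeline can be sketched by sorting files on their modification time. The example assumes GNU find and that the working copy has been mounted read-only at some mount point; the function name mtime_timeline is hypothetical. It covers only mtime and only files that still exist in the mounted view, so it complements rather than replaces image-level tools.

```shell
# Hypothetical helper: list regular files under DIR oldest-first by mtime.
# Output: one line per file, "YYYY-MM-DD+HH:MM:SS.fraction<TAB>path" (local time).
# Usage: mtime_timeline DIR
mtime_timeline() {
    # %T+ prints the mtime in a fixed-width format, so a plain text sort
    # is also a chronological sort.
    find "$1" -type f -printf '%T+\t%p\n' | sort
}
```

For an unmounted image, Sleuth Kit offers an equivalent route: fls with the -m option writes a body file, and mactime -b turns that body file into a timeline that also includes deleted entries.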

Analysis tools

Tool                           Use
Autopsy / Sleuth Kit           Structured examination of forensic images
Bulk Extractor                 Pattern extraction (emails, URLs, keys)
Binwalk / Foremost / Scalpel   File recovery and binary data analysis
ExifTool                       Metadata analysis

Anti-forensics

Suspects often attempt to hinder investigations before or after an incident. Common techniques include:

  • Clearing shell history (history -c, unset HISTFILE) to erase traces of executed commands
  • Deleting files with rm — and on some filesystems, using shred or wipe to overwrite blocks before deletion
  • Timestamp manipulation with touch to make a file appear older or newer than it actually is
  • Log tampering — editing or truncating system logs to remove evidence of login sessions or commands

Recognizing anti-forensic activity is itself part of the investigation. An empty bash history, a log file with an unexpected gap, or a file whose timestamps are inconsistent with surrounding events can all be indicators of deliberate tampering — and that inconsistency is evidence.
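One timestamp inconsistency is easy to check for. The touch command can set mtime and atime to an arbitrary value, but the kernel sets ctime to the current time whenever it does so, and touch cannot set ctime. A backdated file therefore tends to have a ctime much later than its mtime. The sketch below assumes GNU find; the function name ctime_after_mtime and the 60-second default slack are hypothetical choices.

```shell
# Hypothetical helper: print files whose ctime is more than SLACK seconds
# later than their mtime.
# Usage: ctime_after_mtime DIR [SLACK_SECONDS]
ctime_after_mtime() {
    dir=$1; slack=${2:-60}
    # %T@ = mtime and %C@ = ctime, both as seconds since the epoch.
    find "$dir" -type f -printf '%T@\t%C@\t%p\n' |
        awk -F'\t' -v slack="$slack" '$2 - $1 > slack { print $3 }'
}
```

A hit is a lead, not proof: permission changes, renames, and archive extraction that preserves mtime all produce the same gap, so each flagged file still has to be checked against the surrounding events.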


Forensic reporting

The outcome of a forensic investigation is a forensic report: a structured, reproducible document that communicates findings to a technical or legal audience. A well-written report includes:

  • Case information: examiner identity, evidence identifiers, acquisition date and method
  • Hash verification: proof that the analyzed copy matches the original
  • Methodology: tools used and steps taken, in enough detail that another examiner could reproduce the results
  • Findings: recovered files, relevant artifacts, timeline of events
  • Incident hypothesis: a narrative explaining what likely happened, supported by specific evidence

Every claim in the report must be traceable to a specific artifact or tool output. Speculation without evidentiary support undermines the report’s credibility — and its admissibility in a legal proceeding.


Forensic tools in Kali Linux

Kali includes tools for performing forensic tasks without altering the evidence:

  • dcfldd / dd — forensic imaging
  • hashdeep / sha256sum — integrity calculation and verification
  • autopsy / sleuthkit — structured analysis suite
  • foremost / scalpel — deleted file recovery
  • strings — readable text extraction

Kali can be booted in forensic mode (from USB), which avoids automatically mounting disks to prevent evidence alteration.


Hands-on lab

Requirements: Kali Linux, forensic image disco.dd (provided by instructor — ext4, ~100 MB)

Getting the disk image: The image is distributed as three parts on Moodle (disco.dd.part_01, disco.dd.part_02, disco.dd.part_03). Reassemble them before starting:

cat disco.dd.part_01 disco.dd.part_02 disco.dd.part_03 > disco.dd
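Concatenation succeeds silently even when a part is missing from the command or was downloaded incompletely. A slightly more defensive version of the reassembly step might look like the sketch below; the function name reassemble is hypothetical, and the size check only catches missing bytes, not corrupted ones.

```shell
# Hypothetical helper: concatenate parts in the given order, failing if any
# part is missing or empty, then print the SHA-256 of the result.
# Usage: reassemble OUTFILE PART...
reassemble() {
    out=$1; shift
    total=0
    for part in "$@"; do
        [ -s "$part" ] || { echo "missing or empty part: $part" >&2; return 1; }
        total=$((total + $(wc -c < "$part")))
    done

    cat "$@" > "$out" || return 1

    # The output size must equal the sum of the part sizes.
    [ "$(wc -c < "$out")" -eq "$total" ] || return 1
    sha256sum "$out"
}
```

It would be called as reassemble disco.dd disco.dd.part_01 disco.dd.part_02 disco.dd.part_03, with the parts listed in order.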

Scenario

On March 8, 2026, the CISO of Universidad Central received an automated alert from the institution’s SIEM platform. The alert flagged repeated SSH connections to the departmental server (srv-dept01, 192.168.1.1) between 03:00 and 03:15 AM on three consecutive nights — March 5, 6, and 7. The source IP was 192.168.1.45, an internal address registered to a student laptop. The account used was mvalencia, belonging to Miguel Valencia, a student who graduated in December 2023.

According to the offboarding policy, Valencia’s account should have been disabled within 30 days of graduation. A pending audit item from the February 28 team meeting confirms the task was never completed.

The IT Security team immediately isolated srv-dept01, imaged its filesystem, and took the machine offline. No changes were made to the original disk after imaging. You have been brought in as the forensic analyst. Your objective is to reconstruct what happened, recover any deleted evidence, and produce a structured incident report.

You will work on copia.dd, a verified bit-for-bit copy of the original image that you create in Part 1. Do not modify or analyze disco.dd directly.

Part 1: Evidence preservation

  1. Calculate the original hash:
sha256sum disco.dd
  2. Create a working copy (dcfldd computes and logs the source hash automatically; install if needed):
sudo apt install -y dcfldd   # skip if already installed
dcfldd if=disco.dd of=copia.dd hash=sha256 hashlog=hashes.txt
  3. Verify the working copy matches the original:
sha256sum copia.dd

Compare this output against the hash from step 1 and against the reference hash provided by your instructor. All three must match. You can also inspect hashes.txt; dcfldd recorded the source hash there automatically.
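Comparing 64-character hashes by eye is error-prone. The check mode of sha256sum can do the comparison instead, as in this minimal sketch; the function name matches_reference is hypothetical.

```shell
# Hypothetical helper: succeed only if FILE hashes to the expected SHA-256.
# Usage: matches_reference FILE EXPECTED_SHA256
matches_reference() {
    # sha256sum -c reads "HASH  FILENAME" lines and re-hashes each named file.
    printf '%s  %s\n' "$2" "$1" | sha256sum -c - > /dev/null 2>&1
}
```

Calling matches_reference copia.dd with the hash from step 1, and again with the instructor’s reference value, returns status 0 only when the values agree.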

Note

Your instructor recorded the SHA-256 of disco.dd at creation time. They will share it after you compute yours in step 1 — this guards against the possibility that the distributed image was already corrupted before you received it.

Question

If the hashes of the original image and the working copy did not match, what would that mean for the admissibility of the evidence in a legal proceeding? What step would you take next?

Part 2: File extraction and analysis

Filesystem listing with Sleuth Kit:

fls -r copia.dd > structure.txt
cat structure.txt

Files marked with * are deleted. The (realloc) tag means the file name is deleted but the inode it points to is currently allocated, so the inode may since have been reused by another file; use istat to confirm that recovered content really belongs to the deleted name. Note the inode numbers next to deleted entries, because you will need them to recover the files.

Inspect metadata for a specific inode:

istat copia.dd <inode>

Question

Run istat on each deleted inode. Look at the crtime and mtime timestamps. How do they align with the suspicious sessions recorded in auth.log? What does that correlation tell you about when and how the files were created?

Recovering deleted files (ext4):

foremost and scalpel work by recognizing file headers and footers. They are effective for JPEG, PDF, ZIP, and other typed formats, but they will not recover plain text files or shell scripts because those have no magic bytes.

For this image, recover deleted files by inode using icat. Take the inode number from the deleted entries in the fls listing:

icat copia.dd <inode> > recovered_file

For example, if fls showed r/r * 23(realloc): credentials.txt, the inode is 23:

icat copia.dd 23 > recovered_credentials.txt
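When the listing contains several deleted entries, their inode numbers can be pulled out of structure.txt rather than copied by hand. The sketch below assumes GNU sed and the fls line layout shown in the example above (type, asterisk, inode, optional tag, colon, name); the function name deleted_inodes is hypothetical.

```shell
# Hypothetical helper: print "INODE NAME" for every deleted entry in an fls listing.
# Usage: deleted_inodes structure.txt
deleted_inodes() {
    # Optional "+" depth markers, the entry type, the "*" deleted marker,
    # the inode number, anything up to the colon (e.g. "(realloc)"), the name.
    sed -nE 's/^[+ ]*[^ ]+ \* ([0-9]+)[^:]*:[[:space:]]*(.*)$/\1 \2/p' "$1"
}

# Possible use, naming each output after its inode:
#   deleted_inodes structure.txt | while read -r inode name; do
#       icat copia.dd "$inode" > "recovered_$inode"
#   done
```

Naming the output after the inode avoids problems with names that contain slashes or spaces, and keeps the link back to the fls entry explicit for the report.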

Note

extundelete is another ext4 recovery tool that reads the filesystem journal to reconstruct deleted files. It is useful in live-incident scenarios but depends on how the deletion was performed and whether the journal still holds the relevant records. For this lab image, use icat directly.

Text and metadata inspection:

strings copia.dd | grep -iE "(password|user|ssh|credential|http)"

Search for credentials, messages, internal paths. Pipe through less if output is too long.
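To narrow the output down to network indicators, the same idea can be applied with patterns for addresses and URLs. This sketch assumes GNU grep and searches the raw image directly; the function name extract_indicators is hypothetical, and the address pattern is deliberately loose (it also matches strings such as version numbers), so every hit needs review.

```shell
# Hypothetical helper: list unique IPv4-looking strings and http(s) URLs in a file.
# Usage: extract_indicators FILE
extract_indicators() {
    {
        # Four dot-separated groups of one to three digits.
        LC_ALL=C grep -a -o -E '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1"
        # http:// or https:// followed by typical URL characters.
        LC_ALL=C grep -a -o -E 'https?://[A-Za-z0-9._~:/?#@!$&%+=-]+' "$1"
    } | sort -u
}
```

Running extract_indicators copia.dd gives a short list that can be cross-checked against auth.log and the shell history.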

Question

The strings output contains an IP address and a URL. What do these suggest about the external infrastructure used in the attack? What role does each artifact play — the scp destination, the curl URL, and the netcat connection?

Log and artifact inspection:

Use fls to locate log files and user artifacts, then extract them with icat. Key locations in this image:

# Find inodes for files of interest from the fls listing
grep -E "auth\.log|bash_history|passwd" structure.txt
 
# Extract and read them
icat copia.dd <inode> | less

Pay attention to SSH login timestamps, usernames, source IPs, and sudo usage in auth.log, and to the command sequence in .bash_history.
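Successful SSH logins can be summarized with a short filter. The sketch assumes the traditional syslog layout that OpenSSH produces ("Mon DD HH:MM:SS host sshd[pid]: Accepted <method> for <user> from <ip> port ..."); the log in this image may differ, and the function name ssh_logins is hypothetical.

```shell
# Hypothetical helper: print "Mon DD HH:MM:SS user ip" for each accepted SSH login.
# Reads the named file, or standard input when no file is given.
# Usage: ssh_logins [AUTH_LOG]
ssh_logins() {
    grep -a 'sshd\[[0-9]*\]: Accepted ' "${1:--}" |
        awk '{
            # The user follows the word "for"; the source address follows "from".
            for (i = 1; i < NF; i++) {
                if ($i == "for")  user = $(i + 1)
                if ($i == "from") ip   = $(i + 1)
            }
            print $1, $2, $3, user, ip
        }'
}
```

It can be fed straight from the image, for example icat copia.dd <inode> | ssh_logins, which keeps the working copy untouched.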

Question

Examine the command sequence in .bash_history. What reconnaissance steps did the attacker perform before exfiltrating data? Why is history -c at the end significant, and why did it fail to destroy the evidence?

Question

The auth.log shows three consecutive sessions at 3 a.m. from the same IP. What does this pattern suggest about the attacker’s intent and level of access? What is forensically significant about the sudo command executed on March 5?

Question

The /etc/passwd entry for mvalencia includes the note GRADUATED 2023. Cross-reference this with the audit_report_q4_2025.txt and meeting_2026-02-28.txt files you found in the image. What organizational failure made this intrusion possible, and what control would have prevented it?

Question

What types of files were recovered from the image? Does the presence of deleted files suggest intentional deletion or normal usage? What filesystem metadata supports your conclusion?

Part 3: Analysis with Autopsy

Kali ships an outdated version of Autopsy. Build Sleuth Kit from source, then install Autopsy from its release archive.

Install Sleuth Kit 4.15.0 (required by Autopsy 4.23.0):

wget https://github.com/sleuthkit/sleuthkit/releases/download/sleuthkit-4.15.0/sleuthkit-4.15.0.tar.gz
tar xzf sleuthkit-4.15.0.tar.gz
cd sleuthkit-4.15.0
./configure
make
sudo make install
cd ..

Set JAVA_HOME permanently:

echo 'export JAVA_HOME=$(readlink -f $(which java) | sed "s|/bin/java||")' >> ~/.zshrc
source ~/.zshrc

Install Autopsy 4.23.0:

wget https://github.com/sleuthkit/autopsy/releases/download/autopsy-4.23.0/autopsy-4.23.0.zip
unzip autopsy-4.23.0.zip
cd autopsy-4.23.0
sudo -E bash unix_setup.sh

Launch Autopsy:

./bin/autopsy --jdkhome $JAVA_HOME

Create a new case:

  1. On the welcome screen, click New Case.
  2. Enter a case name (e.g. lab13) and choose a base directory. Click Next.
  3. Optionally fill in case number and examiner name. Click Finish.

Add the disk image as a data source:

  1. The Add Data Source dialog opens automatically. Select Disk Image or VM File and click Next.
  2. Click Add and browse to copia.dd. Click Next.
  3. Leave the default ingest modules selected and click Finish. Autopsy will index the image in the background.

Investigate:

  1. Navigate the Timeline view to see file activity over time.
  2. Use the File Views panel to browse deleted files.
  3. Search for suspicious activity: recently accessed files, created or deleted entries, and keyword hits.
  4. Use the Keyword Search module to look for terms like credentials, mvalencia, or 198.51.100.42.

Question

Compare the Autopsy timeline with your manual findings from Part 2. Does the graphical view reveal any activity or ordering of events that was harder to see with command-line tools? What specific advantage does timeline visualization offer in a complex investigation?

Submission

  • Hashes and verification of the original image
  • List and analysis of recovered files
  • Key fragments (strings, metadata, suspicious names)
  • Screenshots and key findings
  • Incident hypothesis (1-2 paragraphs): what happened, who may have done it, what evidence supports the hypothesis

Key concepts

Term                Definition
Digital forensics   Discipline that identifies, preserves, analyzes, and presents digital evidence
Forensic image      Bit-for-bit copy of a device for analysis without altering the original
Chain of custody    Documented record of who handled the evidence, when, and what was done
Autopsy             Graphical forensic analysis tool based on Sleuth Kit
Hash                Fixed-length digest of data, used to verify that evidence has not changed
