OSINT

Objectives: By the end of this topic, you will be able to…

  • Collect publicly available information useful for a security assessment
  • Use OSINT tools included in Kali Linux
  • Recognize the legal and ethical limits of using these tools

What is OSINT

OSINT (Open Source Intelligence) refers to the process of collecting and analyzing publicly available information to obtain useful intelligence.

OSINT draws on openly accessible sources: search engines like Google, Bing, and Yandex; social media platforms such as LinkedIn, Facebook, and Twitter; public records including government databases and WHOIS registries; metadata embedded in documents and images; forums, blogs, and paste sites such as Reddit and Pastebin; and exposed domains and web services. It does not require invasive techniques, and it is generally legal as long as it complies with applicable privacy laws and platform terms of service. Security professionals use it in defensive work such as audits and threat hunting and in authorized red teaming, while criminal adversaries apply the same techniques for their own reconnaissance.


Importance of OSINT in cybersecurity

OSINT supports the reconnaissance stage, which is the first phase of an attack chain (Kill Chain).

Passive vs active reconnaissance

Passive reconnaissance collects information without directly interacting with the target, which makes it hard to detect. Active reconnaissance interacts directly with the target’s infrastructure or services and carries a higher risk of triggering alerts. Despite their low detection risk, passive techniques can yield a large volume of information, often enough to build a detailed profile of a person or organization before any direct contact is made.

Attackers leverage OSINT to select vulnerable targets and craft convincing social engineering lures, while security teams use the same techniques to identify unintentional public exposures — shadow IT, data leaks, or poor staff practices that an adversary could exploit first.


OSINT tools in Kali Linux

| Tool | Description |
| --- | --- |
| theHarvester | Collects email addresses, usernames, hosts, and subdomains |
| Maltego CE | Visualizes relationships between entities such as people, domains, IPs |
| whois | Displays domain owner information |
| dig | Queries DNS records for a domain |
| dnsrecon | Automates DNS information gathering |
| ExifTool | Extracts metadata from images and documents |
| Google Dorks | Advanced use of search engine operators to find sensitive data |

Although these tools automate processes, human analysis remains key for interpreting results.


OSINT techniques by target type

Gathering on people

Goal: obtain data to identify, locate, or profile a person.

Information sought: social media profiles, email addresses, resumes, photographs with metadata, forum participation.

Common techniques include advanced searches using Google Dorks, queries on services like Hunter.io and HaveIBeenPwned to surface email addresses and leaked credentials, analysis of image metadata with tools like ExifTool, and reverse image searches on Google or Yandex to trace a photograph back to its origin.
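As an illustration of the Google Dorks technique, the sketch below assembles a query from the widely used site: and filetype: search operators. The function name build_dork and the example values are assumptions made for this example, and the resulting string is meant to be pasted into a search engine by hand.

```shell
# build_dork: assemble a search query from a domain, a file type,
# and an optional keyword. Hypothetical helper for illustration.
build_dork() {
    local domain="$1" filetype="$2" keyword="${3:-}"
    local query="site:${domain} filetype:${filetype}"
    # Append the optional keyword as a quoted exact-match phrase.
    if [ -n "$keyword" ]; then
        query="${query} \"${keyword}\""
    fi
    printf '%s\n' "$query"
}

build_dork example.com pdf
build_dork example.com xlsx "confidential"
```

Keeping the query construction in one place makes it easier to record in the report exactly which searches were run.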

Gathering on infrastructure

Goal: understand an organization’s digital infrastructure and its public exposure.

Typical data: WHOIS information, DNS and subdomains, IP addresses, public emails, exposed services.

For infrastructure, analysts run WHOIS and DNS queries with dig and dnsrecon, enumerate email addresses and subdomains with theHarvester, map relationships visually in Maltego, and search for exposed services and devices using Shodan or Censys.
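The DNS part of this workflow can be sketched as a small shell function. This is a minimal sketch rather than a complete enumeration: the function name dns_profile and the DIG override variable are hypothetical, and the record types listed are only a common starting set. It relies on dig's +short option to keep the output compact.

```shell
# dns_profile: print the common DNS record types for one domain.
# Hypothetical helper for illustration. Set DIG=echo to preview the
# commands without sending any queries.
dns_profile() {
    local domain="$1" rtype
    for rtype in A AAAA MX NS TXT; do
        printf '== %s ==\n' "$rtype"
        "${DIG:-dig}" +short "$domain" "$rtype"
    done
}

# Preview only: prints the dig arguments instead of running dig.
DIG=echo dns_profile example.com
```

To send real queries from an authorized lab machine, call dns_profile example.com without the override. Running whois example.com and dnsrecon -d example.com alongside it covers registration data and broader DNS enumeration.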


Legal and ethical limits

Although OSINT is based on public sources, not everything accessible is legal to use. Posting something on social media does not imply consent for automated collection, and many platforms explicitly prohibit scraping in their terms of service. In professional audits, written authorization from the client must exist before gathering information about their environment. Throughout any investigation, the privacy and security of third parties must be respected. Gathering information for harmful purposes crosses both legal and ethical lines.

In professional cybersecurity, ethics and legality are as important as technical knowledge.


Hands-on lab

Requirements: Kali Linux, internet access, Maltego CE

Part 1: Passive reconnaissance on a domain

Each pair will receive an assigned domain or identity.

  1. Use whois, nslookup, and dig to profile the domain
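  2. Run theHarvester to search for emails, hosts, and social media:
       theHarvester -d example.com -b google,bing
     The sources accepted by -b vary between theHarvester versions; run theHarvester -h to list the ones your installation supports.
  3. Optional: use crt.sh, hunter.io, amass, or web OSINT tools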
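Once tool output has been saved to a text file (for example by redirecting it with > harvest.txt), the findings can be deduplicated before they go into the report. The sketch below assumes nothing about any tool's output layout: it simply searches free text for addresses and hostnames under the assigned domain. The function names, the sample data, and the file name are hypothetical.

```shell
# Escape dots so the domain is matched literally inside a regex.
escape_domain() {
    printf '%s' "$1" | sed 's/\./\\./g'
}

# extract_emails DOMAIN FILE: unique, lower-cased addresses at DOMAIN
# or any of its subdomains.
extract_emails() {
    local d
    d=$(escape_domain "$1")
    grep -Eio "[a-z0-9._%+-]+@([a-z0-9-]+\.)*${d}" "$2" \
        | tr '[:upper:]' '[:lower:]' | sort -u
}

# extract_hosts DOMAIN FILE: unique, lower-cased subdomains of DOMAIN.
extract_hosts() {
    local d
    d=$(escape_domain "$1")
    grep -Eio "([a-z0-9-]+\.)+${d}" "$2" \
        | tr '[:upper:]' '[:lower:]' | sort -u
}

# Demo on made-up sample data (192.0.2.10 is a documentation address).
sample=$(mktemp)
printf '%s\n' 'Alice@Example.com' 'bob@example.com' \
    'www.example.com:192.0.2.10' > "$sample"
extract_emails example.com "$sample"
extract_hosts example.com "$sample"
rm -f "$sample"
```

The patterns are deliberately loose, so results should still be reviewed by hand before being treated as findings.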

? What email addresses, subdomains, or hosts did theHarvester surface? Were any results unexpected or particularly sensitive? How could an attacker leverage this information against the target?

Part 2: Metadata analysis

Files (PDFs, images, DOCX) with embedded metadata will be provided.

  1. Analyze with exiftool and online tools
  2. Identify authorship, timestamps, software, location
  3. Extract coordinates and plot them on a map
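The steps above can be sketched in the shell as follows. This is an illustrative sketch under a few assumptions: the function names are hypothetical, the tags requested are a typical selection rather than an exhaustive one, and the provided files may not contain all of them. The conversion helper is useful when a tool reports coordinates as degrees, minutes, and seconds but the map expects decimal degrees.

```shell
# show_metadata FILE: print authorship, timestamp, software, and
# position tags if the file contains them (requires exiftool).
show_metadata() {
    exiftool -Author -Creator -CreateDate -ModifyDate \
        -Software -Producer "$1"
    # -c sets the coordinate format; '%.6f' requests decimal degrees.
    exiftool -c '%.6f' -GPSPosition "$1"
}

# dms_to_decimal DEG MIN SEC HEMISPHERE: convert a coordinate to
# signed decimal degrees (S and W become negative).
dms_to_decimal() {
    awk -v d="$1" -v m="$2" -v s="$3" -v h="$4" 'BEGIN {
        dec = d + m / 60 + s / 3600
        if (h == "S" || h == "W") dec = -dec
        printf "%.6f\n", dec
    }'
}

dms_to_decimal 40 26 46 N   # prints 40.446111
```

Call show_metadata on each provided file, then paste the decimal latitude and longitude into a mapping service to plot the location.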

? What does the metadata reveal about the document’s origin and history? Could any of these details — author name, software version, or GPS coordinates — be used to profile or target a specific individual?

Part 3: Visualization in Maltego CE

  1. Create entities and run basic transforms on them
  2. Identify non-obvious connections

? What relationships did Maltego surface that you would not have found through manual searching? What security risk do these connections represent for the organization?

  3. Export the graph image for inclusion in the report

Submission

Report per pair (max. 3 pages plus images):

  • Techniques and tools used
  • Main findings per section
  • Maltego visualization
  • Critical analysis of risks, ethics, and privacy
  • Brief personal reflection (one per student)

Additional resources: OSINT Cheatsheet | Report Template


Key concepts

| Term | Definition |
| --- | --- |
| OSINT | Process of collecting and analyzing publicly available information |
| theHarvester | OSINT tool that collects emails, subdomains, and hosts |
| Google Dorks | Advanced search techniques for finding exposed information |
| Kill Chain | Model describing the phases of a cyberattack, starting with reconnaissance |
| Sniffing | Capture and inspection of traffic traversing a network |
