Cryptography

Objectives: By the end of this topic, you will be able to…

Apply encryption and decryption techniques with available tools

Verify file integrity using hashes

Understand differences between symmetric and asymmetric encryption

Use public and private keys in a practical and secure manner

What is cryptography?

Cryptography is the discipline that studies techniques to protect information, ensuring its confidentiality, integrity, authenticity, and non-repudiation, even when transmitted over insecure channels.

Through mathematical algorithms, cryptography allows data to be encrypted (making it unreadable to unauthorized parties), integrity to be verified (detecting any alteration in transit), identities to be authenticated, and documents or messages to be digitally signed. It is a fundamental pillar of modern cybersecurity, used in HTTPS, encrypted emails, digital signatures, cryptocurrencies, VPNs, and secure storage.

Classical cryptography (Caesar, Vigenère)

Classical methods provide the foundation for understanding substitution, transposition, and keys.

The Caesar Cipher replaces each letter with another shifted a fixed number of positions — with a shift of 3, A becomes D and B becomes E. Its simplicity is also its weakness: there are only 25 possible shifts, making it trivial to break by brute force.
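The brute-force weakness is easy to see in code. The following is a minimal sketch, not a reference implementation; the function names caesar and brute_force are illustrative choices, and only ASCII letters are shifted.

```python
def caesar(text, shift):
    """Shift each ASCII letter by `shift` positions; leave everything else alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext):
    """Return all 25 candidate decryptions as (shift, text) pairs."""
    return [(s, caesar(ciphertext, -s)) for s in range(1, 26)]


if __name__ == "__main__":
    secret = caesar("ATTACK AT DAWN", 3)
    print(secret)  # DWWDFN DW GDZQ
    for shift, guess in brute_force(secret):
        print(shift, guess)  # the readable line reveals the shift
```

An attacker only has to read 25 lines of output to find the one that makes sense.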

The Vigenère Cipher improves on Caesar by using a keyword to define a different shift for each position in the plaintext, introducing the concept of a variable-length key. It is more resistant to brute force, but remains vulnerable to frequency analysis when the ciphertext is long enough relative to the key.
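To make the "different shift for each position" idea concrete, the sketch below turns a keyword into its repeating sequence of shifts and applies them to an uppercase, letters-only string. It is deliberately incomplete (no case handling, no punctuation, no decryption), and the helper names key_shifts and shift_letters are assumptions made for this illustration.

```python
def key_shifts(key, length):
    """Repeat the keyword's shifts (A=0 ... Z=25) until `length` letters are covered."""
    shifts = [ord(c) - ord("A") for c in key.upper() if c.isascii() and c.isalpha()]
    return [shifts[i % len(shifts)] for i in range(length)]


def shift_letters(plaintext, key):
    """Apply one Caesar shift per position; expects uppercase A-Z input only."""
    shifts = key_shifts(key, len(plaintext))
    return "".join(
        chr((ord(p) - ord("A") + s) % 26 + ord("A"))
        for p, s in zip(plaintext, shifts)
    )


if __name__ == "__main__":
    print(key_shifts("KEY", 5))          # [10, 4, 24, 10, 4]
    print(shift_letters("HELLO", "KEY"))  # RIJVS
```

Each position is still a Caesar shift; the keyword only decides which shift applies where.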

Both methods illustrate two important principles: confusion (making the relationship between plaintext and ciphertext difficult to see) and the role of the key as the element that controls encryption and decryption.

Symmetric encryption (AES)

Symmetric encryption uses the same secret key to encrypt and decrypt information. It is fast and efficient for large volumes of data.

AES (Advanced Encryption Standard) is a block cipher that processes data in 128-bit blocks and supports key sizes of 128, 192, or 256 bits. It is the modern standard that replaced DES, operating through multiple rounds of substitutions, permutations, column mixing, and key addition.

Common modes of operation: ECB (Electronic Codebook) is not recommended because identical plaintext blocks produce identical ciphertext blocks, revealing patterns. CBC (Cipher Block Chaining) is more secure: it XORs each plaintext block with the previous ciphertext block before encrypting it, and uses a random initialization vector (IV) in place of the missing previous block for the first one, so identical plaintext produces different ciphertext. GCM (Galois/Counter Mode) goes further, providing both confidentiality and integrity (authenticated encryption) in a single pass.
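The difference between ECB and CBC can be sketched without any third-party library. The example below is an illustration under loud assumptions: toy_encrypt_block is a small 4-round Feistel network built from HMAC-SHA256 that stands in for AES, it is not secure, and the input must already be a multiple of 16 bytes because padding is left out.

```python
import hashlib
import hmac
import os

BLOCK = 16  # bytes, the same block size AES uses
HALF = BLOCK // 2


def _round(key, i, half):
    """Toy round function: keyed hash of one half-block, truncated to 8 bytes."""
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:HALF]


def toy_encrypt_block(key, block):
    """4-round Feistel network standing in for AES. Demo only, not secure."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, bytes(a ^ b for a, b in zip(left, _round(key, i, right)))
    return left + right


def ecb_encrypt(key, data):
    """ECB: every block is encrypted independently of the others."""
    return b"".join(
        toy_encrypt_block(key, data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)
    )


def cbc_encrypt(key, iv, data):
    """CBC: XOR each plaintext block with the previous ciphertext block first."""
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], prev))
        prev = toy_encrypt_block(key, mixed)
        out.append(prev)
    return b"".join(out)


def distinct_blocks(data):
    """Count how many different 16-byte blocks appear in `data`."""
    return len({data[i:i + BLOCK] for i in range(0, len(data), BLOCK)})


if __name__ == "__main__":
    key, iv = os.urandom(16), os.urandom(BLOCK)
    plaintext = b"SIXTEEN BYTE MSG" * 8  # 8 identical 16-byte blocks
    print(distinct_blocks(ecb_encrypt(key, plaintext)))      # 1: the repetition leaks
    print(distinct_blocks(cbc_encrypt(key, iv, plaintext)))  # 8: chaining hides it
```

The same structural effect appears with real AES: ECB maps equal plaintext blocks to equal ciphertext blocks, while chaining makes each block depend on everything before it.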

Typical uses: file encryption, secure communications (VPN, HTTPS), storage of sensitive data.

Asymmetric cryptography (RSA)

Asymmetric cryptography employs a key pair: one public key (used to encrypt or to verify signatures) and one private key (used to decrypt or to sign). Its security rests on hard mathematical problems such as factoring large integers.

RSA (Rivest-Shamir-Adleman):

Widely used asymmetric algorithm
Security based on the difficulty of factoring the product of two large prime numbers
Enables encryption, decryption, and digital signing

Basic operation: RSA generates a key pair — a public key (e, n) used for encryption and a private key (d, n) used for decryption. Encryption computes C = M^e mod n and decryption recovers the original message as M = C^d mod n.
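The two formulas can be checked with deliberately tiny numbers. This is a textbook sketch only: the primes 61 and 53 and the exponent 17 are illustrative values chosen here, real keys use primes hundreds of digits long, and real RSA always applies a padding scheme such as OAEP before encrypting.

```python
from math import gcd

p, q = 61, 53                # toy primes, far too small for real use
n = p * q                    # modulus: 3233
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, must be coprime with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)          # private exponent: 2753 (needs Python 3.8+)


def encrypt(m):
    """C = M^e mod n, using the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c):
    """M = C^d mod n, using the private key (d, n)."""
    return pow(c, d, n)


message = 65
cipher = encrypt(message)
print(cipher)           # 2790
print(decrypt(cipher))  # 65
```

Anyone who can factor n = 3233 back into 61 × 53 can recompute d, which is why the primes must be large.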

Characteristics: slower than symmetric algorithms, so it is not used to encrypt large files directly but to protect symmetric keys (as in the RSA key exchange of TLS 1.2 and earlier) or to sign data.

Common uses: establishing secure connections (SSL/TLS), secure key exchange, digital signature and authentication.

Hash functions (MD5, SHA-1, SHA-256)

A hash function takes an input of any length and produces a fixed-length output (hash or digest), representing the “fingerprint” of the original content.

A good hash function has four key properties. It must be deterministic — the same input always produces the same hash. It must be fast to compute, so it can be applied to large files or high-frequency operations efficiently. It must be collision resistant, making it computationally infeasible to find two different inputs that produce the same digest. And it must be preimage resistant, meaning the original message cannot be reconstructed from the hash alone.

The digest length is one of the most visible differences between algorithms. Running the same input through all three shows the output space growing with each generation:

import hashlib
 
data = b"hello"
print(hashlib.md5(data).hexdigest())     # 5d41402abc4b2a76b9719d911017c592
print(hashlib.sha1(data).hexdigest())    # aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
print(hashlib.sha256(data).hexdigest())  # 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

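MD5 produces 32 hex characters (128 bits), SHA-1 produces 40 (160 bits), and SHA-256 produces 64 (256 bits). The larger the output space, the harder it is to find two inputs that collide by brute force; MD5 and SHA-1 are additionally weakened by structural flaws that make collisions far cheaper than their digest length suggests.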

Common algorithms:

Algorithm	Output	Status
MD5	128 bits	Obsolete, vulnerable to collisions
SHA-1	160 bits	Broken, practical collisions demonstrated in 2017
SHA-256	256 bits	Currently secure, widely used

Typical uses: file integrity verification, digital signatures, password storage (with salts and key derivation like bcrypt/scrypt/argon2).
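For password storage, a plain hash is not enough; a salted, deliberately slow key derivation function is used instead. The sketch below uses hashlib.scrypt from the Python standard library (available when Python is built against OpenSSL 1.1 or later). The cost parameters n=2**14, r=8, p=1 and the function names are assumptions made for illustration, not a tuning recommendation.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters; real deployments should tune these.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)


def hash_password(password):
    """Return (salt, digest). A fresh random salt is generated per password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, digest = hash_password("hunter2")
    print(verify_password("hunter2", salt, digest))  # True
    print(verify_password("hunter3", salt, digest))  # False
```

Because the salt differs per user, two users with the same password end up with different stored digests.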

Applications

Integrity verification

When a file is created or released, a hash is computed and published alongside it. Any subsequent change to the file — even a single flipped bit — produces a completely different hash, immediately revealing tampering. This mechanism is widely used to verify software downloads, ISO images, and backups.
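The same check can be scripted. The following is a minimal sketch, assuming the published digest is a SHA-256 hex string obtained from a source you trust; sha256_file and verify_file are illustrative names.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_file(path, published_hex):
    """Return True only if the file's digest matches the published one."""
    return hmac.compare_digest(sha256_file(path), published_hex.strip().lower())
```

A call such as verify_file("image.iso", published_digest) would return False after any modification of the file; the file name here is only a placeholder.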

Digital signature

A digital signature combines asymmetric cryptography and hash functions. The sender hashes the document and signs that hash with their private key (in RSA, this is often described as encrypting the hash with the private key), producing the signature. The receiver uses the sender’s public key to recover the signed hash, recomputes the hash independently, and confirms they match. A match proves both that the content was not altered and that it came from the holder of the private key.
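The hash-then-sign flow can be sketched with textbook RSA. Everything about the key here is an assumption for illustration: the modulus is the product of two well-known Mersenne primes, which is far too small and too structured for real use, the digest is reduced modulo N to fit, and real signatures use a padding scheme such as PSS.

```python
import hashlib

# Toy key material: two known Mersenne primes. Demo only, not secure.
P = 2**89 - 1
Q = 2**107 - 1
N = P * Q
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def digest_as_int(message):
    """SHA-256 digest as an integer, reduced mod N to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message):
    """Sender: apply the private-key operation to the hash."""
    return pow(digest_as_int(message), D, N)


def verify(message, signature):
    """Receiver: undo the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(message)


if __name__ == "__main__":
    sig = sign(b"pay Alice 10")
    print(verify(b"pay Alice 10", sig))    # True
    print(verify(b"pay Alice 1000", sig))  # False: content changed
```

Only the private exponent can produce a value that the public exponent maps back to the document's hash, which is what ties the signature to the key holder.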

Basic obfuscation

Lightweight encryption or hashing is sometimes used to hide sensitive strings in binaries, scripts, or logs — making it harder for automated tools or casual inspection to recognize credentials, keys, or command patterns. This technique is common in malware and in CTF challenges, and recognizing it is an important skill in malware analysis.
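A typical example is a string XORed with a single byte and then Base64-encoded. The sketch below shows both directions; the key 0x5A, the marker word, and the function names are hypothetical choices for this illustration.

```python
import base64


def xor_bytes(data, key):
    """XOR every byte with a single-byte key (the operation is its own inverse)."""
    return bytes(b ^ key for b in data)


def obfuscate(text, key):
    """Hide a string the way a simple script or sample might."""
    return base64.b64encode(xor_bytes(text.encode(), key)).decode()


def candidates(blob_b64, marker):
    """Try all 256 keys and keep decodings that contain the expected marker."""
    raw = base64.b64decode(blob_b64)
    hits = []
    for key in range(256):
        decoded = xor_bytes(raw, key)
        if marker in decoded:
            hits.append((key, decoded))
    return hits


if __name__ == "__main__":
    blob = obfuscate("password=hunter2", 0x5A)
    print(blob)                           # unreadable at a glance
    print(candidates(blob, b"password"))  # [(90, b'password=hunter2')]
```

With only 256 possible keys, this kind of obfuscation slows down casual inspection but offers no real confidentiality.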

Hands-on lab

Requirements: Kali Linux, openssl, gpg, sha256sum, Python 3

Part 1: Data integrity with hash functions

Create a file with known content:

printf 'This is a confidential document.\nAuthor: %s\nDate: %s\n' "$(whoami)" "$(date)" > document.txt
cat document.txt

Compute the SHA-256 hash and save it to a checksum file — this is how software distributors ship verified downloads:

sha256sum document.txt
sha256sum document.txt > document.txt.sha256
cat document.txt.sha256

Verify integrity using the checksum file:

sha256sum -c document.txt.sha256

Compare the three most common hash algorithms side by side. Note the different digest lengths:

md5sum document.txt
sha1sum document.txt
sha256sum document.txt

The outputs are 32 hex characters (128 bits) for MD5, 40 for SHA-1, and 64 for SHA-256. Longer digests mean a larger output space, making collisions exponentially harder to find.

Demonstrate the avalanche effect: change a single character and recompute all three:

sed 's/confidential/Confidential/' document.txt > document_modified.txt
md5sum document.txt document_modified.txt
sha1sum document.txt document_modified.txt
sha256sum document.txt document_modified.txt
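If you want a number rather than a visual comparison, the difference between two digests can be counted in bits. This is an optional sketch; bit_difference is an illustrative helper that hashes two byte strings with SHA-256 and counts the differing output bits.

```python
import hashlib


def bit_difference(a, b):
    """Number of bits that differ between the SHA-256 digests of a and b."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")


if __name__ == "__main__":
    changed = bit_difference(b"confidential", b"Confidential")
    print(changed, "of 256 bits differ")
```

For a well-behaved hash function the count should land near half of the 256 output bits.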

Question

How significant was the change in the hash after modifying just one character? What property of hash functions does this demonstrate?

Simulate a tampered download: append a line to the original file, then run checksum verification again:

echo "Injected malicious line." >> document.txt
sha256sum -c document.txt.sha256

Record the exact error message. This is the detection mechanism that package managers like apt use to catch corrupted or tampered packages.

Part 2: Symmetric encryption with AES and openssl

Create a file with highly repetitive content in which every line is exactly 16 bytes long, matching the AES block size. This alignment is what makes the difference between CBC and ECB visible later:

python3 -c "print('passwd=hunter2!\n' * 20)" > secret.txt
cat secret.txt

Encrypt with AES-256-CBC. The -pbkdf2 flag uses a modern key derivation function to derive the encryption key from the password:

openssl enc -aes-256-cbc -salt -pbkdf2 -in secret.txt -out secret_cbc.enc
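Conceptually, -salt and -pbkdf2 turn the password plus a fresh random salt into the key and IV. The sketch below mirrors that idea with hashlib.pbkdf2_hmac, assuming the defaults documented for recent OpenSSL releases (PBKDF2 with HMAC-SHA256, 10000 iterations, an 8-byte salt, and 48 bytes of output split into a 32-byte key and a 16-byte IV). Treat those parameters as assumptions and check man openssl-enc on your system.

```python
import hashlib
import os


def derive_key_iv(password, salt, iterations=10_000):
    """Derive a 32-byte AES-256 key and a 16-byte IV from password and salt."""
    material = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), salt, iterations, dklen=48
    )
    return material[:32], material[32:]


if __name__ == "__main__":
    for run in (1, 2):
        salt = os.urandom(8)  # a new salt on every encryption run
        key, iv = derive_key_iv("correct horse", salt)
        print(run, salt.hex(), key.hex()[:16], iv.hex()[:16])
```

The salt is stored in the output file so that decryption can repeat the same derivation from the password alone.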

Inspect the encrypted file — it should be unreadable binary. Look at the raw bytes:

file secret_cbc.enc
xxd secret_cbc.enc | head -8

Encrypt the same file a second time with the same password and compare the two outputs:

openssl enc -aes-256-cbc -salt -pbkdf2 -in secret.txt -out secret_cbc2.enc
sha256sum secret_cbc.enc secret_cbc2.enc

Are the two CBC-encrypted files identical? Why or why not? What role do the random salt and the IV (initialization vector) derived from it play in each encryption run?

Decrypt the first file and verify it is a perfect copy of the original:

openssl enc -aes-256-cbc -d -pbkdf2 -in secret_cbc.enc -out secret_decrypted.txt
sha256sum secret.txt secret_decrypted.txt
diff secret.txt secret_decrypted.txt

Try decrypting with the wrong password and observe the result.

Now encrypt the same file using ECB mode and run the same experiment:

openssl enc -aes-256-ecb -salt -pbkdf2 -in secret.txt -out secret_ecb.enc
openssl enc -aes-256-ecb -salt -pbkdf2 -in secret.txt -out secret_ecb2.enc
sha256sum secret_ecb.enc secret_ecb2.enc

Use xxd to examine both outputs side by side and look for repeating 16-byte block patterns:

xxd secret_cbc.enc | head -20
xxd secret_ecb.enc | head -20
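Spotting repeats by eye in a hex dump is error-prone, so a short script can count them. This sketch assumes the file was produced by openssl enc with a salt, in which case the first 16 bytes are a header (the text Salted__ followed by the 8-byte salt) rather than ciphertext; repeated_blocks is an illustrative helper name.

```python
import sys
from collections import Counter


def repeated_blocks(ciphertext, block_size=16, header=16):
    """Map each ciphertext block that occurs more than once to its count."""
    body = ciphertext[header:]
    blocks = [body[i:i + block_size] for i in range(0, len(body), block_size)]
    return {block: n for block, n in Counter(blocks).items() if n > 1}


if __name__ == "__main__":
    # Usage: python3 count_blocks.py secret_cbc.enc secret_ecb.enc
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            repeats = repeated_blocks(f.read())
        print(path, "repeated blocks:", len(repeats))
        for block, n in repeats.items():
            print("  ", block.hex(), "x", n)
```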

Question

What behavioral difference did you observe when switching from CBC to ECB mode? Why is ECB considered insecure for encrypting structured or repetitive data?

Part 3: Asymmetric encryption with GPG

Generate a key pair. When prompted, choose RSA and RSA with a 4096-bit key size:

gpg --full-generate-key

List your keyring to confirm the key was created. Record the key ID and fingerprint:

gpg --list-keys
gpg --fingerprint "your name"

Export your public key to share with your partner:

gpg --export -a "your name" > yourname.pub
cat yourname.pub

Import your partner’s public key into your keyring:

gpg --import partnername.pub
gpg --list-keys

Verify the imported fingerprint by reading it out loud to your partner or comparing screens directly — not through the same channel used to share the file. This step prevents a man-in-the-middle attack where someone swaps the public key in transit:

gpg --fingerprint "partner's name"

Create a message, encrypt it for your partner, and sign it with your private key so they can confirm it came from you:

echo "This is a secret, authenticated message." > message.txt
gpg -se -r "partner's name" message.txt

This produces message.txt.gpg — encrypted with your partner’s public key and signed with your private key.

Decrypt and verify the signature of the message your partner sent you:

gpg -d message.txt.gpg

GPG automatically verifies the signature and reports whether it is valid. A Good signature message means the content was not tampered with and came from the expected sender.

Question

What would happen if someone intercepted the encrypted message but did not have the recipient’s private key? What property of asymmetric encryption ensures the message remains confidential?

Part 4: Classical cryptography — Vigenère

Step 1: Implement the cipher

Write vigenere.py in Python with three functions: encrypt(plaintext, key), decrypt(ciphertext, key), and a main block that reads mode, text, and key from command-line arguments.

Algorithm:

Treat the key as uppercase letters only; ignore non-alphabetic characters in the key
For each alphabetic character in the text, shift it by the current key character’s value (A=0, B=1, ..., Z=25) and preserve its case; advance through the key, cycling, only when a letter is processed
Pass non-alphabetic characters through unchanged without consuming a key letter (spaces and punctuation stay in place), so that the letters-only analysis in Step 3 lines up with the key

Expected behavior:

$ python3 vigenere.py encrypt "Hello, World!" KEY
Rijvs, Uyvjn!

$ python3 vigenere.py decrypt "Rijvs, Uyvjn!" KEY
Hello, World!

Verify that decrypt(encrypt(text, key), key) returns the original text for at least 3 different keys and messages of varying length.

Step 2: Encrypt a long text for your partner

Choose a paragraph of at least 300 alphabetic characters of English text (a Wikipedia introduction, a news article excerpt, etc.). Choose a key of 4–8 letters that you keep secret from your partner.

python3 vigenere.py encrypt "$(cat plaintext.txt)" YOURKEY > ciphertext.txt

Exchange ciphertext.txt with your partner — but not the key.

Step 3: Implement the cracker

Write crack_vigenere.py to break your partner’s ciphertext. The attack works in two stages.

Stage 1 — Key length estimation with the Index of Coincidence (IoC)

The IoC measures how unevenly distributed the letters in a text are. For a string with N letters and letter counts f_i (for each of the 26 letters):

IoC = Σ f_i × (f_i − 1) / (N × (N − 1))

English plaintext has IoC ≈ 0.065. Random or well-mixed ciphertext has IoC ≈ 0.038. The key insight: if you split a Vigenère ciphertext into k groups by stride (group 0 = positions 0, k, 2k, …; group 1 = positions 1, k+1, 2k+1, …; etc.), and k matches the true key length, each group becomes a simple Caesar cipher — and its IoC will be close to 0.065.
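The formula translates almost directly into code. The function below is a sketch of the IoC calculation only, not of the full cracker; the name index_of_coincidence and the choice to ignore non-letters are assumptions made for this illustration.

```python
from collections import Counter


def index_of_coincidence(text):
    """IoC = sum of f_i * (f_i - 1) over all letters, divided by N * (N - 1)."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0  # undefined for fewer than two letters; 0.0 is a convention here
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


if __name__ == "__main__":
    print(index_of_coincidence("AAAA"))  # 1.0: every pair of letters matches
    print(index_of_coincidence("ABCD"))  # 0.0: no pair of letters matches
```

The two extreme cases bracket the values you will see in practice: the more uneven the letter distribution, the higher the result.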

For each candidate key length from 1 to 20:

Split the ciphertext (letters only) into k groups by stride
Compute the IoC of each group using the formula above
Average the IoCs across all k groups
Print the candidate length and its average IoC

The smallest length whose average IoC is close to 0.065 is your best guess for the key length. Multiples of the true key length also score high, so do not simply pick the maximum.

Stage 2 — Recover each key letter with frequency analysis

Once you have the key length k:

Split the ciphertext into k groups using the same stride method
For each group, count how often each of the 26 letters appears
Find the most frequent letter in the group. Assume it decrypts to E — the most common letter in English. The key letter for that position is then (index_of_most_frequent − 4) % 26, where A=0, B=1, …, E=4, …
Build the full candidate key from all k recovered letters

Decrypt the ciphertext with your candidate key using vigenere.py. If the output is readable English, you have broken the cipher. If a few words look wrong, try swapping one key letter at a time with the second or third most frequent letter in that group — short texts do not always have E as the most frequent letter in every column.

Question

At what key length does the IoC method start to require much longer ciphertexts to work reliably? What is the theoretical upper limit of Vigenère security, and what cipher design eventually solved it?

Submission

Compressed folder containing:

Screenshots of each major step: sha256sum -c verification, tampered-file detection, xxd inspection, AES encrypt/decrypt, GPG encrypt/decrypt
document.txt, document_modified.txt, and the document.txt.sha256 checksum file
secret_cbc.enc, secret_ecb.enc, and secret_decrypted.txt with sha256sum comparison output
yourname.pub, message.txt.gpg, and the decryption output showing the signature verification
vigenere.py and crack_vigenere.py with usage examples and sample output
plaintext.txt, ciphertext.txt (the one you sent your partner), and the decrypted result of your partner’s ciphertext
Short document (1–2 pages) explaining: what the avalanche effect showed, why the CBC outputs differed between runs, what the Good signature message in GPG means, and what key length your cracker recovered and how many columns needed a second-guess correction

Key concepts

Term	Definition
AES	Standard symmetric encryption algorithm with 128-bit blocks
Symmetric encryption	System that uses the same key to encrypt and decrypt
Asymmetric encryption	System that uses a key pair: public and private
RSA	Asymmetric algorithm whose security rests on the difficulty of factoring the product of two large primes
SHA-256	256-bit hash function, currently secure and widely used
Hash	Function that converts data into a fixed-length string
GPG	Free implementation of OpenPGP for encryption and digital signatures
