Using Grover’s Algorithm (Qiskit 2.3.0)
Classical cryptography relies on the computational hardness of key search.
While modern cryptosystems are designed to resist both classical and quantum attacks,
historical ciphers such as the Vigenère cipher provide an excellent testbed for
demonstrating quantum algorithmic advantage in structured search problems.
This repository presents a hybrid classical–quantum cryptanalysis pipeline that
recovers the secret key used to encrypt a file containing 100 Vigenère-encrypted words.
The approach combines classical cryptanalysis for keyspace reduction with
Grover’s quantum search algorithm for accelerated key discovery.
Given:
- A file containing 100 words encrypted using the Vigenère cipher
- The encryption key is unknown
- Each encrypted word appears on a new line
Objective:
- Analyze the ciphertext to infer likely key lengths and candidates
- Reduce the exponential keyspace using classical cryptanalytic techniques
- Apply Grover’s algorithm to search the remaining candidate space
- Demonstrate quantum advantage via reduced oracle/query complexity
- Recover the key and decrypt all 100 words
Classical brute-force attacks on the Vigenère cipher scale exponentially with key length: for a key of length $k$ over the English alphabet, the keyspace is $26^k$.
Even with cryptanalytic heuristics, the final verification step remains a search problem.
Grover’s algorithm offers a provable quadratic speedup, reducing search complexity to $\mathcal{O}(\sqrt{26^k}) = \mathcal{O}(26^{k/2})$.
This project demonstrates:
- How quantum algorithms integrate into real cryptanalytic workflows
- Why hybrid classical–quantum approaches are essential on NISQ-era hardware
- How quantum advantage differs from quantum supremacy in practice
- Kasiski Examination – detects repeating patterns to infer key length
- Friedman Test (Index of Coincidence) – statistical key length estimation
- Frequency Analysis – χ² scoring against English letter distributions
- Keyspace Reduction – filters improbable keys before quantum search
Kasiski examination scans the ciphertext for repeated sequences of length 3 or more, notes the distances between their occurrences, and takes the divisors of those distances. Those divisors are strong candidates for the key length because they reflect the periodicity of the repeating key. In practice, you might find several distances like 24, 36, and 60, whose common divisors are 3, 6, and 12, giving you a shortlist of plausible key lengths.
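The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation; the helper name `kasiski_candidates` is chosen here for clarity.

```python
from collections import defaultdict

def kasiski_candidates(ciphertext: str, seq_len: int = 3) -> set:
    """Find repeated seq_len-grams, collect the distances between their
    occurrences, and return the divisors of those distances as
    candidate key lengths. Illustrative sketch only."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - seq_len + 1):
        positions[ciphertext[i:i + seq_len]].append(i)
    # Distances between consecutive occurrences of each repeated sequence
    distances = [b - a
                 for occ in positions.values() if len(occ) > 1
                 for a, b in zip(occ, occ[1:])]
    candidates = set()
    for d in distances:
        for k in range(2, min(d, 20) + 1):  # cap candidate lengths at 20
            if d % k == 0:
                candidates.add(k)
    return candidates
```

For example, a ciphertext in which the trigram `ABC` recurs at distance 6 yields the candidate lengths 2, 3, and 6.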
The Friedman test is a statistical way to estimate key length without relying on repeated patterns. The idea is that English text has a characteristic index of coincidence (IC) around 0.065, whereas uniformly random text has a much lower IC around 0.038. If you assume a key length $k$, you split the ciphertext into $k$ columns by taking every k-th letter. Each column should behave like a Caesar cipher of English text, so its IC should be close to the English value if your guess for $k$ is correct. By computing IC for different candidate lengths and comparing them to the expected English IC, you can estimate the most likely key length. This method is especially useful when the ciphertext is long and pattern repetition is not obvious.
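A minimal sketch of the IC computation and column-wise Friedman estimate follows; the function names are illustrative, not the project's actual API.

```python
def index_of_coincidence(text: str) -> float:
    """IC = sum over letters of n_i(n_i - 1) / (N(N - 1))."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = [text.count(c) for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"]
    return sum(c * (c - 1) for c in counts) / (n * (n - 1))

def friedman_key_length(ciphertext: str, max_len: int = 12) -> int:
    """Pick the key length whose columns' average IC is closest to the
    English value (~0.065). Illustrative sketch only."""
    english_ic = 0.065
    best_k, best_diff = 1, float("inf")
    for k in range(1, max_len + 1):
        cols = [ciphertext[i::k] for i in range(k)]  # every k-th letter
        avg_ic = sum(index_of_coincidence(c) for c in cols) / k
        if abs(avg_ic - english_ic) < best_diff:
            best_k, best_diff = k, abs(avg_ic - english_ic)
    return best_k
```

A text of one repeated letter has IC 1.0; a text with all distinct letters has IC 0.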
Frequency analysis is used to recover each key character. For a given position in the key, you take the corresponding column of ciphertext letters and treat it as a Caesar-shifted version of English text. For each possible shift from 0 to 25, you “decrypt” that column and compute a χ² (chi-squared) score comparing the observed letter frequencies with known English letter frequencies. The shift that minimizes the χ² score is the most likely key letter for that position. Doing this independently for each column gives you the full key.
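The column scoring can be sketched as follows. The frequency table is an approximate standard English distribution; the helpers `chi_squared` and `best_shift` are illustrative names, not the repository's code.

```python
# Approximate English letter frequencies for A..Z (fractions of text).
ENGLISH_FREQ = [
    0.082, 0.015, 0.028, 0.043, 0.127, 0.022, 0.020, 0.061, 0.070,
    0.002, 0.008, 0.040, 0.024, 0.067, 0.075, 0.019, 0.001, 0.060,
    0.063, 0.091, 0.028, 0.010, 0.024, 0.002, 0.020, 0.001,
]

def chi_squared(column: str, shift: int) -> float:
    """χ² of the column decrypted with `shift` against English frequencies."""
    n = len(column)
    counts = [0] * 26
    for ch in column:
        counts[(ord(ch) - 65 - shift) % 26] += 1  # undo the Caesar shift
    return sum((counts[i] - n * ENGLISH_FREQ[i]) ** 2 / (n * ENGLISH_FREQ[i])
               for i in range(26))

def best_shift(column: str) -> int:
    """Most likely key letter for one column = shift minimizing χ²."""
    return min(range(26), key=lambda s: chi_squared(column, s))
```

Applied to a column of common English letters Caesar-shifted by 3, `best_shift` recovers the shift 3.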
Keyspace reduction is the idea of using all the above classical insights to drastically reduce the number of candidate keys before doing any exhaustive or quantum search. Instead of searching all $26^k$ possible keys, you keep only those key lengths supported by Kasiski and Friedman, and for each key position you keep only the few shifts that have good χ² scores. This shrinks the search space from astronomically large to something manageable.
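The combinatorics of this reduction can be sketched directly: given the few surviving shifts per key position (e.g., from χ² ranking), the remaining keyspace is their Cartesian product. The helper name is hypothetical.

```python
from itertools import product

def enumerate_reduced_keys(top_shifts: list) -> list:
    """Given a list of surviving shifts per key position, enumerate the
    reduced keyspace as key strings. Illustrative sketch only."""
    return ["".join(chr(65 + s) for s in combo)   # shift -> letter A..Z
            for combo in product(*top_shifts)]

# Keeping 2 shifts at each of 3 positions leaves 2**3 = 8 candidate keys
# instead of the full 26**3 = 17576.
reduced = enumerate_reduced_keys([[2, 4], [0, 19], [7, 11]])
```

With $m$ surviving shifts per position and key length $k$, the candidate space shrinks from $26^k$ to $m^k$.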
- Grover’s Search Algorithm
- Phase Oracle Construction
- Diffuser (Inversion About the Mean)
- Amplitude Amplification
- Query Complexity Analysis
```mermaid
flowchart TD
    A[Ciphertext File<br/>100 Encrypted Words] --> B[Kasiski Examination]
    B --> C[Friedman Test]
    C --> D[Reduced Key Lengths]
    D --> E[Classical Frequency Scoring]
    E --> F[Candidate Keyspace]
    F --> G[Grover Oracle Construction]
    G --> H[Grover Iterations + Diffuser]
    H --> I[Key Identified]
    I --> J[Decrypt All 100 Words]
```
Shor’s algorithm is designed for problems with strong algebraic structure, such as integer factorization and discrete logarithms, which underpin public-key cryptosystems like RSA and ECC.
The Vigenère cipher key-recovery problem does not possess such structure. Instead, it reduces to a search problem with an efficiently verifiable solution:
- A candidate key can be tested by decrypting the ciphertext
- The correctness of the key can be verified using statistical language metrics
- No known algebraic shortcut exists for directly computing the key
Grover’s algorithm is therefore the theoretically optimal quantum algorithm for this task, offering a quadratic speedup over classical search: $\mathcal{O}(\sqrt{N})$ oracle queries instead of $\mathcal{O}(N)$ for a candidate space of size $N$.
This makes Grover’s algorithm the natural choice for accelerating Vigenère cryptanalysis in the NISQ era.
We define an English-likeness score based on n-gram statistics:

$$\mathcal{S}(\mathcal{P}) = \sum_{i=0}^{N-n} \log \Pr(p_i p_{i+1} \dots p_{i+n-1})$$

where:

- $\mathcal{P} = (p_0, p_1, \dots, p_{N-1})$ is the decrypted plaintext.
- $n$ is the size of the n-gram (e.g., 2 = bigram, 3 = trigram).
- $\Pr(p_i p_{i+1} \dots p_{i+n-1})$ is the probability of that n-gram in English.
- The sum aggregates log-likelihood across the entire message.

A higher score indicates more English-like plaintext.
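A toy version of this score, with a tiny made-up log-probability table standing in for a real English n-gram corpus:

```python
from math import log  # a real table would be built from log of corpus counts

# Illustrative bigram log-probabilities; real pipelines load thousands
# of entries from English corpus statistics. Values here are invented.
BIGRAM_LOGP = {"TH": -2.0, "HE": -2.2, "IN": -2.4, "ER": -2.5}
FLOOR = -10.0  # log-probability assigned to unseen n-grams

def english_score(plaintext: str, n: int = 2) -> float:
    """Sum of log Pr over all n-grams; higher = more English-like."""
    return sum(BIGRAM_LOGP.get(plaintext[i:i + n], FLOOR)
               for i in range(len(plaintext) - n + 1))
```

English-like strings score strictly higher than random ones, which is all the oracle's threshold test needs.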
Given a key $\mathcal{K}$, each plaintext character is recovered as

$$p_i = (c_i - k_{i \bmod m}) \bmod 26$$

where:

- $\mathcal{C} = (c_0, c_1, \dots, c_{N-1})$ is the ciphertext.
- $\mathcal{K} = (k_0, k_1, \dots, k_{m-1})$ is the key.
- $m$ is the key length.
- $p_i$ is the decrypted plaintext character.

We define the decryption operator $\mathcal{D}_{\mathcal{K}}$, so that $\mathcal{P}^{(\mathcal{K})} = \mathcal{D}_{\mathcal{K}}(\mathcal{C})$.
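The decryption rule translates directly into code. A minimal sketch over A–Z, with an illustrative function name:

```python
def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Apply p_i = (c_i - k_{i mod m}) mod 26 over the alphabet A-Z."""
    m = len(key)
    return "".join(
        chr((ord(c) - ord(key[i % m])) % 26 + 65)
        for i, c in enumerate(ciphertext)
    )
```

For example, the classic pair: `RIJVS` decrypted with key `KEY` yields `HELLO`.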
We define a Boolean predicate that decides whether a key produces valid English:

$$f(\mathcal{K}) = \begin{cases} 1 & \text{if } \mathcal{S}(\mathcal{D}_{\mathcal{K}}(\mathcal{C})) \ge \tau \\ 0 & \text{otherwise} \end{cases}$$

where:

- $\mathcal{S}(\cdot)$ is the n-gram score.
- $\tau$ is a chosen threshold.
- The output is 1 for a correct key and 0 otherwise.
The Grover oracle flips the phase of valid keys:

$$U_f \, |\mathcal{K}\rangle = (-1)^{f(\mathcal{K})} \, |\mathcal{K}\rangle$$

- If key $\mathcal{K}$ is valid $\rightarrow$ phase flip $(-1)$.
- If invalid $\rightarrow$ state unchanged.

Substituting the decision function:

$$U_f \, |\mathcal{K}\rangle = (-1)^{\left[\mathcal{S}(\mathcal{D}_{\mathcal{K}}(\mathcal{C})) \ge \tau\right]} \, |\mathcal{K}\rangle$$
For each candidate key $\mathcal{K}$, the oracle conceptually:

- Decrypts the ciphertext: $\mathcal{P}^{(\mathcal{K})} = \mathcal{D}_{\mathcal{K}}(\mathcal{C})$
- Computes the English score: $s = \mathcal{S}(\mathcal{P}^{(\mathcal{K})})$
- Compares $s$ with the threshold $\tau$.
- Applies a phase flip if the key is valid.
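These four steps can be modeled classically as a single function that returns the phase factor $(-1)^{f(\mathcal{K})}$. The scoring rule below (fraction of common English letters) and the threshold are deliberately simplified stand-ins for the n-gram score $\mathcal{S}$ and $\tau$:

```python
def oracle_phase(key: str, ciphertext: str, tau: float) -> int:
    """Classical model of the phase oracle: -1 (phase flip) for a valid
    key, +1 otherwise. Helpers are inlined, simplified stand-ins."""
    def decrypt(c, k):  # p_i = (c_i - k_{i mod m}) mod 26
        return "".join(chr((ord(ch) - ord(k[i % len(k)])) % 26 + 65)
                       for i, ch in enumerate(c))
    def score(p):  # toy stand-in for S: fraction of common letters
        return sum(p.count(ch) for ch in "ETAOINSHR") / max(len(p), 1)
    return -1 if score(decrypt(ciphertext, key)) >= tau else 1
```

With ciphertext `RIJVS`, the correct key `KEY` (plaintext `HELLO`) triggers the phase flip, while a wrong key such as `QQQ` leaves the state unchanged.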
For implementable circuits, we often replace the statistical score with detection of a specific known plaintext pattern (a crib). Then the oracle becomes:

$$U_f \, |\mathcal{K}\rangle = (-1)^{\left[\text{pattern found in } \mathcal{D}_{\mathcal{K}}(\mathcal{C})\right]} \, |\mathcal{K}\rangle$$
Grover’s algorithm can be understood geometrically as a sequence of rotations in a two-dimensional subspace spanned by:
- $|w\rangle$: the superposition of valid keys
- $|r\rangle$: the superposition of invalid keys
Starting from a uniform superposition over all candidate keys, each Grover iteration applies:
- A phase oracle, which flips the phase of valid states
- A diffuser (inversion about the mean), which amplifies their amplitudes
After $t \approx \frac{\pi}{4}\sqrt{N/M}$ iterations, the probability of measuring a valid key is maximized, where $N$ is the size of the candidate keyspace and $M$ is the number of valid keys.
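The oracle-plus-diffuser iteration can be verified numerically with exact amplitude bookkeeping. This is a classical simulation of the quantum state, not a Qiskit circuit; the function name is illustrative.

```python
from math import sqrt

def grover_success_prob(n_qubits: int, marked: set, iters: int) -> float:
    """Simulate Grover exactly: phase oracle (sign flip on marked states)
    followed by the diffuser (inversion about the mean).
    Returns the probability of measuring a marked state."""
    N = 2 ** n_qubits
    state = [1 / sqrt(N)] * N                  # uniform superposition |s>
    for _ in range(iters):
        state = [-a if i in marked else a      # oracle: flip marked phases
                 for i, a in enumerate(state)]
        mean = sum(state) / N                  # diffuser: a -> 2<a> - a
        state = [2 * mean - a for a in state]
    return sum(state[i] ** 2 for i in marked)
```

With 5 qubits and one marked key, $t \approx \frac{\pi}{4}\sqrt{32} \approx 4$ iterations drives the success probability above 99%, matching the $\sin^2((2t{+}1)\theta)$ prediction.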
| Approach | Search Complexity |
|---|---|
| Classical brute-force | $\mathcal{O}(26^k)$ |
| Classical cryptanalysis + verification | $\mathcal{O}(N)$ |
| Grover’s algorithm | $\mathcal{O}(\sqrt{N})$ |

Here, $k$ is the key length and $N$ is the size of the reduced candidate keyspace after classical analysis.
In practical cryptanalysis, the dominant cost is key verification, not key generation. This makes oracle-call complexity the correct metric for comparison.
| Method | Verification Calls |
|---|---|
| Classical verification | $\mathcal{O}(N)$ |
| Grover’s search | $\mathcal{O}(\sqrt{N})$ |
This reduction in verification cost constitutes the demonstrated quantum advantage.
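The comparison is concrete arithmetic: Grover needs roughly $\frac{\pi}{4}\sqrt{N}$ oracle calls versus $N$ classical verifications in the worst case. A hypothetical helper illustrating the numbers:

```python
from math import pi, sqrt

def oracle_calls(n_candidates: int) -> tuple:
    """Worst-case classical verification calls vs. Grover's ~(pi/4)*sqrt(N)
    iterations. Illustrative arithmetic only."""
    classical = n_candidates                      # N in the worst case
    grover = round(pi / 4 * sqrt(n_candidates))   # optimal iteration count
    return classical, grover
```

For a reduced keyspace of 4096 candidates, this gives 4096 classical verifications versus about 50 Grover oracle calls.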
- Python 3.9 or higher
- Qiskit 2.3.0
- NumPy
- SciPy
- Matplotlib
Install all dependencies using:
```bash
pip install -r requirements.txt
```

Prashik N Somkuwar