A line-by-line code review of quantum circuits across Qiskit (IBM), PennyLane (Xanadu), and Cirq (Google) — the bugs, the patterns, and what it taught me about quantum code quality.

The Most-Reproduced Quantum Chemistry Calculation Has a Bug

Let me start with the finding that surprised me most.

The hydrogen molecule (H₂) VQE circuit is probably the most-reproduced quantum chemistry calculation in existence. It's in every textbook, every tutorial, every framework's getting-started guide.

Mine had a bug.

The qubit Hamiltonian — the standard five-term parity-mapped form (II, IZ, ZI, ZZ, XX) with 2-qubit reduction on an STO-3G basis set — included a two-body interaction term with an incorrect coefficient. The affected coefficient was small enough that the optimized energy still looked plausible, so the error never surfaced in testing. O'Malley et al. (2016) use a related but different Hamiltonian form (Bravyi-Kitaev with both XX and YY terms); my bug was in the parity-reduced variant commonly used in Qiskit tutorials.

I only caught it by re-deriving the Hamiltonian from second quantization and comparing term-by-term.

Takeaway: "It gives the right answer" is not the same as "it's correct." When the error is in a small coefficient, the optimized energy can look plausible even though the Hamiltonian is wrong — making these bugs incredibly hard to catch through testing alone.
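A minimal sketch of what a term-by-term comparison can look like, and why the energy alone is a weak check. The coefficients are rounded, illustrative values of the kind quoted in tutorials for this five-term form, not taken from my code, and the sign flip on ZZ is a hypothetical bug chosen only for illustration.

```python
import numpy as np

# Single-qubit Pauli matrices used to build the two-qubit terms.
PAULIS = {
    "I": np.eye(2),
    "X": np.array([[0.0, 1.0], [1.0, 0.0]]),
    "Z": np.array([[1.0, 0.0], [0.0, -1.0]]),
}

def hamiltonian_matrix(coeffs):
    """Sum coeff * (P0 tensor P1) over two-character Pauli labels."""
    total = np.zeros((4, 4))
    for label, coeff in coeffs.items():
        total += coeff * np.kron(PAULIS[label[0]], PAULIS[label[1]])
    return total

def ground_energy(coeffs):
    """Lowest eigenvalue by exact diagonalization (no VQE involved)."""
    return float(np.linalg.eigvalsh(hamiltonian_matrix(coeffs))[0])

def differing_terms(reference, candidate, tol=1e-9):
    """Return {label: (reference, candidate)} for every term that disagrees."""
    labels = sorted(set(reference) | set(candidate))
    return {
        label: (reference.get(label, 0.0), candidate.get(label, 0.0))
        for label in labels
        if abs(reference.get(label, 0.0) - candidate.get(label, 0.0)) > tol
    }

# Illustrative, rounded coefficients (assumption, not the audited values).
reference = {"II": -1.0524, "IZ": 0.3979, "ZI": -0.3979, "ZZ": -0.0113, "XX": 0.1809}
# Hypothetical bug: the small ZZ coefficient has the wrong sign.
buggy = dict(reference, ZZ=0.0113)

print("differing terms:", differing_terms(reference, buggy))
print("energy shift:", ground_energy(buggy) - ground_energy(reference))
```

The energy moves by only a couple of hundredths of a Hartree in this toy case, which is easy to wave away as optimizer noise, while the coefficient diff points straight at the wrong term.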

This was one of six findings from auditing every circuit I'd built. Here's the full story.

Why I Did This

Last year I started building QubitHub — a free tool I use while learning quantum computing. I'd put together 50 circuits covering everything from Bell states to VQE to Grover's search, across three frameworks: Qiskit (IBM), PennyLane (Xanadu), and Cirq (Google).

Most ran. Most produced output. But did they actually work correctly?

I didn't know. And that bothered me.

So I did something that felt slightly obsessive: I audited every single circuit. Not just "does it run" — but "is every gate correct, is every parameter justified, and could someone else reproduce why this circuit does what it claims?"

The audit took several weeks of evenings and weekends — five phases, from basic execution verification to cross-cutting consistency checks. All circuits were tested on simulators (Qiskit Aer and framework-native backends), not real hardware.

The Library

50 deliberately curated circuits (not scraped from tutorials) across 11 categories, including algorithms (Deutsch-Jozsa, Grover's, Phase Estimation), entanglement and communication (Bell State, Teleportation), quantum chemistry (VQE variants for H₂ through BeH₂), machine learning (variational classifiers, kernel methods), optimization (QAOA for MaxCut, TSP, Portfolio), error correction (bit-flip through Surface Code), variational ansatze, and research reproductions of landmark papers.

37 in Qiskit, 7 in PennyLane, 6 in Cirq.

Finding #1: Dependency Updates Break Circuits

The first thing I did was run every circuit: 46 out of 50 passed. The 4 that failed were all PennyLane circuits, but for two different reasons. Three failed because a NumPy update changed array creation semantics, so PennyLane's autograd ArrayBox wrapper was rejected with a confusing "invalid array creation" error. The fix was to use PennyLane's own wrapped NumPy functions, pnp.stack and pnp.mean (pnp being the usual alias for pennylane.numpy), instead of bare NumPy functions.

The fourth, a data re-uploading classifier, timed out during training. It computed gradients with manual finite differences (~96K QNode calls) where the paper used analytic gradients. Switching to PennyLane's built-in gradient-based optimizer cut the QNode workload dramatically.
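To illustrate the difference between an approximate and an analytic gradient, here is a toy sketch in plain Python. It assumes a single RY(θ) rotation on |0⟩ measured in Z, and uses the parameter-shift rule as the analytic method; whether the paper or PennyLane's optimizer used exactly that rule for this circuit is an assumption on my part. It does not reproduce the workload saving, only the accuracy gap.

```python
import math

def expval_z(theta):
    """<Z> after RY(theta)|0>, computed from the two amplitudes."""
    a0 = math.cos(theta / 2.0)
    a1 = math.sin(theta / 2.0)
    return a0 * a0 - a1 * a1  # equals cos(theta)

def finite_difference(f, theta, eps=1e-3):
    """Forward difference: approximate, with error that depends on eps."""
    return (f(theta + eps) - f(theta)) / eps

def parameter_shift(f, theta):
    """Parameter-shift rule for a Pauli rotation: exact up to rounding."""
    return (f(theta + math.pi / 2.0) - f(theta - math.pi / 2.0)) / 2.0

theta = 0.7
exact = -math.sin(theta)  # d/dtheta cos(theta)
print("finite difference error:", abs(finite_difference(expval_z, theta) - exact))
print("parameter shift error:  ", abs(parameter_shift(expval_z, theta) - exact))
```

Both functions call the circuit twice per parameter here, so the lesson is about correctness and noise sensitivity rather than call counts.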

Takeaway: Framework version pinning matters enormously in quantum computing. A circuit that works on Monday can break on Tuesday because a dependency updated. This is exactly the kind of problem I keep running into while learning — and why I started putting circuits on QubitHub where I could track what worked with which versions.
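One low-effort way to track "what worked with which versions" is to record the installed versions next to each passing run. This is a sketch using only the standard library; the function names are hypothetical, and the package list is whatever your circuits import.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

def pin_lines(versions):
    """Render requirements-style pins, skipping packages that are not installed."""
    return [
        f"{name}=={version}"
        for name, version in sorted(versions.items())
        if version is not None
    ]

if __name__ == "__main__":
    found = installed_versions(["qiskit", "pennylane", "cirq", "numpy"])
    print("\n".join(pin_lines(found)))
```

Saving that output alongside a circuit's test results makes it possible to tell a real regression from a dependency bump.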

Finding #2: The VQE Hamiltonian Bug

(Described above — the most surprising finding.)

Finding #3: Ordering Conventions Cause Subtle Bugs

Qiskit uses little-endian qubit ordering: qubit 0 is the rightmost bit in a measurement string. Other frameworks handle ordering differently (Cirq, for example, defaults to big-endian ordering and lets you override it with an explicit qubit_order parameter). If you port code between frameworks without accounting for this, results silently change.

I found this in two places:

Phase Estimation: The inverse QFT rotation order was reversed — the phase was read out in backwards bit order because the implementation assumed big-endian convention.
Error Correction (3-qubit bit-flip code): The syndrome bit interpretation was inverted — 01 was decoded as "error on qubit 0" when it should have identified qubit 2.

Both circuits ran fine. Both produced output. Both were wrong.

Takeaway: Framework ordering conventions are a common source of subtle quantum bugs. Cross-framework comparison is a powerful debugging tool — if your Qiskit and Cirq implementations give different answers, at least one has a convention issue.
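The ordering issue is easiest to see on a single measurement string. This is a small sketch in plain Python with hypothetical helper names; it assumes qubit 0 is always the least significant bit of the register value and only the printed position of qubit 0 changes between conventions.

```python
def qubit_value(bitstring, qubit, little_endian=True):
    """Return the measured bit for `qubit`.

    little_endian=True: qubit 0 is the rightmost character (Qiskit style).
    little_endian=False: qubit 0 is the leftmost character.
    """
    index = len(bitstring) - 1 - qubit if little_endian else qubit
    return int(bitstring[index])

def register_value(bitstring, little_endian=True):
    """Integer value of the register, with qubit 0 as least significant bit."""
    bits = bitstring if little_endian else bitstring[::-1]
    return int(bits, 2)

readout = "001"
# Same string, two conventions, two different answers.
print(qubit_value(readout, 0), qubit_value(readout, 0, little_endian=False))
print(register_value(readout) / 8, register_value(readout, little_endian=False) / 8)
```

Read as a phase-estimation register, the same string gives 1/8 under one convention and 4/8 under the other, which is exactly the kind of result that runs fine and is still wrong.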

Finding #4: Qubit Count Is a Terrible Difficulty Metric

I'd originally labeled circuits as "beginner," "intermediate," or "advanced" based primarily on qubit count. Grover's search uses only 2-3 qubits, so it was labeled "beginner."

That's wrong. Grover's requires understanding phase kickback, amplitude amplification, and constructive interference — concepts that take weeks to build intuition for. Meanwhile, a 10-qubit GHZ state is conceptually simple (just a chain of CNOTs after a Hadamard).
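To make the GHZ claim concrete, here is a tiny statevector sketch in NumPy rather than in any of the three frameworks. The helper names are hypothetical, and qubit 0 is treated as the first tensor axis, which does not matter for this symmetric state.

```python
import numpy as np

HADAMARD = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

def apply_single_qubit(state, gate, target, n):
    """Apply a 2x2 gate to one qubit of an n-qubit statevector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [target]))
    psi = np.moveaxis(psi, 0, target)  # put the acted-on axis back in place
    return psi.reshape(-1)

def apply_cnot(state, control, target, n):
    """Flip the target qubit on the part of the state where control is 1."""
    psi = state.reshape([2] * n).copy()
    index = [slice(None)] * n
    index[control] = 1
    block = psi[tuple(index)]
    # The control axis is gone inside `block`, so later axes shift down by one.
    axis = target - 1 if target > control else target
    psi[tuple(index)] = np.flip(block, axis=axis).copy()
    return psi.reshape(-1)

def ghz_state(n):
    """One Hadamard on qubit 0, then a chain of CNOTs down the register."""
    state = np.zeros(2 ** n)
    state[0] = 1.0
    state = apply_single_qubit(state, HADAMARD, 0, n)
    for q in range(n - 1):
        state = apply_cnot(state, q, q + 1, n)
    return state

state = ghz_state(10)
print(np.flatnonzero(np.abs(state) > 1e-12))  # indices of nonzero amplitudes
```

The whole construction is one loop, which is the point: ten qubits, but only two ideas.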

After the audit, I recalibrated every difficulty label based on conceptual complexity:

Bell State, GHZ, Teleportation → genuinely beginner
Deutsch-Jozsa, Bernstein-Vazirani → beginner (simple oracle structure)
Grover's, QFT, Phase Estimation → intermediate (deep conceptual requirements)
VQE, QAOA, QML circuits → intermediate to advanced

Takeaway: If you're building a learning path for quantum computing, don't sort by qubit count. Sort by the prerequisite concepts.

Finding #5: Research Reproductions and Paper-Derived Circuits Had Drifted

Five circuits were reproductions of landmark papers — Peruzzo et al. 2014 (the original VQE), Farhi et al. 2014 (the original QAOA), Pérez-Salinas et al. 2020 (data re-uploading), and others. In the neighboring variational ansatz set, I found similar drift.

The most concerning case: a UCCSD ansatz circuit used hardcoded double-excitation indices that only worked for 4 qubits, despite the paper describing a general-purpose method. In the research reproductions, one had substituted a simplified gradient computation (manual finite differences) where the paper used analytic gradients.

Neither was documented. A researcher forking these circuits would have inherited silent approximations without knowing they diverged from the original paper.

I added a [reproducibility] section to each circuit's manifest, explicitly documenting: which paper or result, what parameters, and where the implementation diverges.

Takeaway: Reproducibility requires more than sharing code. It requires documenting exactly which version of which paper your circuit implements, and being honest about where you've approximated.

Finding #6: Learning Paths Had Broken Links

Each circuit has prerequisites and next_circuits fields forming a learning path. I ran a validation script and found 22 errors across 17 circuits:

Phantom prerequisites referencing circuits that don't exist
Circular dependencies (A requires B, B requires A)
Dead-end circuits with no "what's next" pointer
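A sketch of how such a validation pass can be written, assuming the library is loaded as a dict keyed by circuit id. The circuit ids, the error tuple format, and the function name are hypothetical; only the prerequisites and next_circuits field names come from the manifests.

```python
def validate_paths(circuits):
    """Return a list of (kind, circuit_id, reference) problems."""
    errors = []

    # Phantom references and dead ends.
    for cid, meta in circuits.items():
        for field in ("prerequisites", "next_circuits"):
            for ref in meta.get(field, []):
                if ref not in circuits:
                    errors.append(("phantom", cid, ref))
        if not meta.get("next_circuits"):
            errors.append(("dead_end", cid, None))

    # Circular prerequisites, found with a depth-first search.
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    status = {cid: UNVISITED for cid in circuits}

    def visit(cid):
        status[cid] = IN_PROGRESS
        for ref in circuits[cid].get("prerequisites", []):
            if ref not in circuits:
                continue  # already reported as phantom
            if status[ref] == IN_PROGRESS:
                errors.append(("cycle", cid, ref))
            elif status[ref] == UNVISITED:
                visit(ref)
        status[cid] = DONE

    for cid in circuits:
        if status[cid] == UNVISITED:
            visit(cid)
    return errors

# Hypothetical three-circuit library with one of each problem.
library = {
    "bell": {"prerequisites": [], "next_circuits": ["ghz"]},
    "ghz": {"prerequisites": ["bell", "qft"], "next_circuits": ["teleport"]},
    "qft": {"prerequisites": ["ghz"], "next_circuits": []},
}
for problem in validate_paths(library):
    print(problem)
```

Whether a circuit with no next_circuits counts as an error is a policy choice; a real library probably wants an allowlist for genuine end-of-track circuits.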

Individual circuits without context are just code snippets. Connected learning paths turn them into a curriculum. After fixing the links, the library has four validated progression tracks — from basic VQE to complex molecular simulations, from simple QAOA to multi-constraint optimization.

Takeaway: A circuit library is only as useful as its navigation.

The Methodology

If you want to audit your own quantum circuits, here's the framework I settled on:

Code check (per circuit):

Does it execute without errors on a simulator?
Are rotation angles expressed as π/N with comments? (no magic numbers)
Is the qubit count parameterized, not hardcoded?
Are there gate-by-gate comments explaining why each gate is there?
Is there a verification function that checks the output?
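As an example of the last item, here is a sketch of a verification function for a Bell-state circuit. It assumes the simulator returns a counts dictionary keyed by bitstring; the function name and the 5% tolerance are arbitrary choices for illustration.

```python
def verify_bell_counts(counts, tolerance=0.05):
    """Check that outcomes are split roughly 50/50 between '00' and '11'."""
    total = sum(counts.values())
    if total == 0:
        return False
    p00 = counts.get("00", 0) / total
    p11 = counts.get("11", 0) / total
    leakage = 1.0 - p00 - p11  # probability mass on '01' and '10'
    return (
        abs(p00 - 0.5) <= tolerance
        and abs(p11 - 0.5) <= tolerance
        and leakage <= tolerance
    )

print(verify_bell_counts({"00": 510, "11": 490}))
print(verify_bell_counts({"00": 400, "01": 200, "11": 400}))
```

A check like this only catches wrong distributions, not wrong phases, so it complements rather than replaces comparing against an independent derivation.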

Documentation check:

Four-layer explanation: Intuition → How it works → Math → Implementation
Expected output in plain language
"What's Next" links to related circuits

Cross-reference check:

Verify every paper citation (title, authors, DOI)
Compare implementations against independent sources
Check external links still resolve

What Changed

Before vs. after:

50/50 circuits execute correctly (was 46/50)
50/50 have verified outputs (was ~30/50)
50/50 have four-layer READMEs (was ~20/50)
4 validated learning paths with 0 broken links (was 22 errors)
Standardized gate names across frameworks (e.g., CX instead of CNOT — same gate, consistent naming)

I'm much more confident in each one now — but I'd still welcome expert eyes.

Try Them

Every circuit from this audit is on QubitHub — you can browse the code, run them in the browser, or fork and modify.

→ qubithub.co

I'm still learning quantum computing. Every week I re-derive a circuit gate by gate, and every gap I find as a learner becomes a feature in the tool.

If you're learning too — or if you're an expert who spots something I missed — I'd love to hear from you. DMs open on X.

Nandan writes about learning quantum computing at qubithub.co/blog.

I Audited 50 Quantum Circuits. Here's What I Found.