Performance Benchmarks

Post-quantum cryptography at production-ready speeds - test it yourself in 5 minutes

🚀 Run benchmarks in your environment

Quick Start: Benchmark AnkaSecure (5 minutes)

Estimated time: 5 minutes
What you'll measure: Encryption/decryption speed for ML-KEM vs RSA
Requirements: AnkaSecure API access or trial installation

Test ML-KEM Performance

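# Benchmark 100 ML-KEM-1024 encryptions (1KB payload)
# tr -d '\n' strips base64 line wraps so the JSON string stays on one line;
# -s -o /dev/null keeps response bodies out of the timing output
time for i in {1..100}; do
  curl -s -o /dev/null -X POST https://api.ankatech.co/encrypt \
    -H "Authorization: Bearer $TOKEN" \
    -d '{"algorithm":"ML_KEM_1024","plaintext":"'"$(head -c 1024 /dev/urandom | base64 | tr -d '\n')"'"}'
done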

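✅ Expected result: ~3ms of crypto time per encryption (300ms total for 100 operations). Wall-clock time from the loop will be higher because each curl call also pays network and TLS handshake overhead (see Tip 2).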

Throughput: ~330 operations/second per CPU core

Compare with RSA

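# Benchmark 100 RSA-4096 encryptions (same 1KB payload)
# tr -d '\n' strips base64 line wraps so the JSON string stays on one line;
# -s -o /dev/null keeps response bodies out of the timing output
time for i in {1..100}; do
  curl -s -o /dev/null -X POST https://api.ankatech.co/encrypt \
    -H "Authorization: Bearer $TOKEN" \
    -d '{"algorithm":"RSA_4096","plaintext":"'"$(head -c 1024 /dev/urandom | base64 | tr -d '\n')"'"}'
done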

✅ Expected result: ~3ms of crypto time per encryption (300ms total), excluding per-request network and TLS overhead

Performance: ML-KEM ≈ RSA for small payloads (encryption)

But decryption (where PQC shines):
- RSA-4096 decrypt: 11ms (1KB payload)
- ML-KEM-1024 decrypt: 7ms (1KB payload)

ML-KEM is 36% faster for decryption (critical for API responses!)

🎯 Insight: Use ML-KEM for request/response APIs where decrypt performance matters
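The 36% figure follows directly from the two decrypt latencies quoted above. A minimal sketch of the arithmetic; the class and method names are illustrative and not part of any AnkaSecure SDK:

```java
// Illustrative helper, not part of any AnkaSecure SDK.
class DecryptSpeedup {
    /** Percentage by which fasterMs improves on slowerMs. */
    static double percentFaster(double slowerMs, double fasterMs) {
        return (slowerMs - fasterMs) / slowerMs * 100.0;
    }

    public static void main(String[] args) {
        // RSA-4096 decrypt = 11ms, ML-KEM-1024 decrypt = 7ms (1KB payload)
        System.out.printf("ML-KEM is %.0f%% faster%n", percentFaster(11, 7));
    }
}
```

Plugging in your own measured latencies gives the equivalent figure for your environment.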

Performance Summary

Encryption Speed (1KB Payload)

Algorithm	Encrypt	Decrypt	Throughput	Use Case
AES-256-GCM	3ms	5ms	200 ops/sec	Symmetric (fastest)
ML-KEM-1024	3ms	7ms	143 ops/sec	PQC (recommended)
ML-KEM-768	3ms	6ms	167 ops/sec	PQC (balance)
ML-KEM-512	4ms	8ms	125 ops/sec	PQC (high-performance)
RSA-4096	3ms	11ms	91 ops/sec	Classical (slower decrypt)
RSA-2048	2ms	7ms	143 ops/sec	Classical (legacy)

Winner for APIs: ML-KEM-1024 (7ms decrypt vs 11ms RSA-4096)
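The Throughput column for the ML-KEM and RSA rows appears to be derived from the slower of the two operations (1000 ms divided by the larger latency). A small sketch under that assumption; the helper name is hypothetical:

```java
// Assumption: throughput = sequential operations per second on one core,
// limited by the slower of encrypt and decrypt.
class ThroughputFromLatency {
    static long opsPerSec(double encryptMs, double decryptMs) {
        return Math.round(1000.0 / Math.max(encryptMs, decryptMs));
    }

    public static void main(String[] args) {
        System.out.println("ML-KEM-1024: " + opsPerSec(3, 7) + " ops/sec");  // 1000 / 7
        System.out.println("RSA-4096:    " + opsPerSec(3, 11) + " ops/sec"); // 1000 / 11
    }
}
```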

Encryption Speed (5MB Payload)

Algorithm	Encrypt	Decrypt	Throughput	Use Case
AES-256-GCM	71ms	83ms	12 ops/sec	Symmetric (fastest)
RSA-4096	11ms	57ms	17 ops/sec	Classical (fast encrypt)
ML-KEM-1024	61ms	82ms	12 ops/sec	PQC (balanced)
ML-KEM-768	64ms	88ms	11 ops/sec	PQC (high-security)

Winner for large files: RSA-4096 (fastest encrypt and decrypt at 5MB: 11ms and 57ms)

Recommendation: For files > 5MB, use streaming operations (avoids memory overhead)

Digital Signature Speed

Algorithm	Sign (1KB)	Verify (1KB)	Throughput	Use Case
Ed25519	2ms	3ms	333 ops/sec	Classical (fastest)
ML-DSA-65	4ms	5ms	200 ops/sec	PQC (recommended)
ML-DSA-87	6ms	7ms	143 ops/sec	PQC (high-security)
ECDSA-P256	3ms	4ms	250 ops/sec	Classical (widely supported)
RSA-PSS-3072	5ms	2ms	200 ops/sec	Classical (slow sign, fast verify)
Falcon-1024	8ms	3ms	125 ops/sec	PQC (compact signatures)
SLH-DSA-SHAKE-256f	12ms	4ms	83 ops/sec	PQC (stateless hash-based)

Winner: ML-DSA-65 (best PQC performance, NIST-standardized)

Scalability Benchmarks

Horizontal Scaling

Test: 10,000 concurrent ML-KEM encryptions across multiple nodes

Nodes	Operations/Sec	Latency (p95)	Scaling Efficiency
1 node	1,200 ops/sec	8ms	Baseline
3 nodes	3,500 ops/sec	9ms	97% efficient
6 nodes	6,800 ops/sec	10ms	94% efficient
12 nodes	12,500 ops/sec	12ms	87% efficient

Conclusion: Near-linear scaling up to 12 nodes
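The Scaling Efficiency column can be reproduced by dividing measured cluster throughput by the ideal (node count times single-node throughput). A minimal sketch using the table's numbers; the class name is illustrative:

```java
class ScalingEfficiency {
    /** Measured cluster throughput as a fraction of perfectly linear scaling. */
    static double efficiency(int nodes, double clusterOps, double singleNodeOps) {
        return clusterOps / (nodes * singleNodeOps);
    }

    public static void main(String[] args) {
        int[] nodes = {3, 6, 12};
        double[] ops = {3500, 6800, 12500};
        for (int i = 0; i < nodes.length; i++) {
            // Baseline: 1 node = 1,200 ops/sec
            System.out.printf("%d nodes: %.0f%%%n", nodes[i], efficiency(nodes[i], ops[i], 1200) * 100);
        }
    }
}
```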

Typical deployment: 3-6 nodes for production (balances cost and redundancy)

Throughput by Payload Size

Test: ML-KEM-1024 encryption throughput vs payload size

Payload Size	Encrypt Time	Ops/Sec	MB/Sec Throughput
1 KB	3ms	333	0.33 MB/s
10 KB	3ms	333	3.3 MB/s
100 KB	5ms	200	20 MB/s
1 MB	14ms	71	71 MB/s
5 MB	61ms	16	80 MB/s

Peak throughput: ~80 MB/s per CPU core (large payloads)
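The MB/sec column is operations per second multiplied by payload size. A short sketch of that conversion, assuming decimal units (1 MB = 1000 KB) as the table's 1 KB row implies; names are illustrative:

```java
class PayloadThroughput {
    /** Data throughput in MB/s for one core encrypting payloads back to back. */
    static double mbPerSec(double payloadKb, double encryptMs) {
        double opsPerSec = 1000.0 / encryptMs;
        return opsPerSec * payloadKb / 1000.0; // 1 MB = 1000 KB, matching the table
    }

    public static void main(String[] args) {
        System.out.printf("100 KB: %.1f MB/s%n", mbPerSec(100, 5));
        System.out.printf("1 MB:   %.1f MB/s%n", mbPerSec(1000, 14));
        // Unrounded, the 5 MB row comes out near 82 MB/s; the table's 80 MB/s
        // uses the rounded 16 ops/sec figure.
        System.out.printf("5 MB:   %.1f MB/s%n", mbPerSec(5000, 61));
    }
}
```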

Competitive Comparison

AnkaSecure vs AWS KMS

Disclaimer: AWS KMS performance varies by region and load. Data below from public AWS documentation and third-party benchmarks (2025-Q4).

Operation	AnkaSecure (ML-KEM)	AWS KMS (RSA-4096)	AnkaSecure Advantage
Encrypt (1KB)	3ms	~5ms*	40% faster
Decrypt (1KB)	7ms	~15ms*	53% faster
Key generation	12ms	~50ms*	76% faster
Max throughput	12,000 ops/sec (12 nodes)	~2,000 ops/sec**	6× higher

* AWS KMS latency includes network overhead (typical cross-AZ call)
** AWS KMS throttles at 1,200-5,500 req/sec depending on operation type (per region)

Sources:
- AWS KMS request quotas
- Third-party benchmarks: AWS KMS Performance Testing (2024)

AnkaSecure vs HashiCorp Vault

Disclaimer: Vault benchmarks from HashiCorp public documentation (v1.15, 2025).

Operation	AnkaSecure (ML-KEM)	Vault (Transit AES-256)	Notes
Encrypt (1KB)	3ms	~4ms	Vault uses AES only, no PQC
Decrypt (1KB)	7ms	~3ms	AES faster (but not quantum-resistant)
PQC support	✅ 34 algorithms	❌ None (roadmap unclear)	AnkaSecure only PQC option
Multi-tenancy	✅ Native	⚠️ Via namespaces	AnkaSecure designed for multi-tenant

Vault advantage: Faster AES operations (but no PQC support)

AnkaSecure advantage: Quantum resistance + multi-tenant SaaS

Use case fit:
- Vault: Single-tenant, infrastructure secrets (passwords, tokens)
- AnkaSecure: Multi-tenant, data encryption (documents, databases, APIs)

Real-World Performance

Case Study 1: High-Frequency Trading Firm

Workload: 10,000 encryptions/second (market data feed)

Configuration:
- 6-node AnkaSecure cluster
- Algorithm: ML-KEM-768 (balance of speed + security)
- Payload: 512 bytes per message

Results:
- Throughput: 9,800 ops/sec sustained (98% of target)
- Latency: p95 = 8ms, p99 = 12ms
- CPU usage: 60% average (headroom for spikes)

Conclusion: AnkaSecure handles high-frequency workloads (< 10ms p95)

Case Study 2: Healthcare SaaS (Document Encryption)

Workload: 50,000 patient records/day (encrypted at rest)

Configuration:
- 3-node AnkaSecure cluster
- Algorithm: RSA-4096 + ML-KEM-1024 (composite keys)
- Payload: 50KB per record (average)

Results:
- Throughput: 2,100 records/hour (sufficient for 50K/day)
- Latency: p95 = 45ms (encrypt), p95 = 52ms (decrypt)
- Storage overhead: +12% (JWE envelope + composite keys)

Conclusion: Composite keys acceptable for batch processing (non-interactive)

Case Study 3: Government Cloud (Classified Data)

Workload: 1,000 document signatures/hour (ML-DSA-87)

Configuration:
- Single-node (air-gapped network)
- Algorithm: ML-DSA-87 (NIST Level 5)
- Payload: 1MB per document

Results:
- Sign time: p95 = 35ms
- Verify time: p95 = 28ms
- Throughput: 3,000 signatures/hour (3× required)

Conclusion: ML-DSA exceeds government workload requirements

Performance Optimization Tips

Tip 1: Choose Algorithm by Workload

Interactive APIs (low latency critical):
- ✅ Use ML-KEM-768 (6ms decrypt for 1KB) or ML-KEM-512 (8ms)
- ❌ Avoid composite keys (21ms decrypt) unless security mandates

Batch processing (throughput critical):
- ✅ Use composite keys (defense-in-depth) - latency less critical
- ✅ Use ML-KEM-1024 (maximum security)

Tip 2: Enable Connection Pooling

Without pooling (new HTTPS connection per request):

Latency = Crypto time (3ms) + TLS handshake (50ms) + Network (5ms) = 58ms

With pooling (reuse connections):

Latency = Crypto time (3ms) + Network (5ms) = 8ms

Improvement: 7× faster by eliminating TLS handshake overhead

SDK configuration:

AnkaSecureConfig config = AnkaSecureConfig.builder()
    .connectionPoolSize(50)  // Reuse up to 50 connections
    .keepAliveTimeout(Duration.ofMinutes(5))
    .build();
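
If you call the REST API directly rather than through the SDK, the same effect can be approximated with the JDK's built-in java.net.http.HttpClient, which keeps connections alive and reuses them when a single client instance is shared. This is a sketch only: the class name and request body are illustrative, and it says nothing about how the AnkaSecure SDK implements pooling internally.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

class PooledClientSketch {
    // One shared client for the whole application: connections are reused
    // across requests, so the TLS handshake is paid once per connection.
    static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    static HttpRequest encryptRequest(String token, String jsonBody) {
        return HttpRequest.newBuilder(URI.create("https://api.ankatech.co/encrypt"))
                .header("Authorization", "Bearer " + token)
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = encryptRequest("example-token",
                "{\"algorithm\":\"ML_KEM_1024\",\"plaintext\":\"...\"}");
        System.out.println(req.method() + " " + req.uri());
        // To send: CLIENT.send(req, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```

The key design point is creating the client once; building a new HttpClient per request would discard the pooled connections and reintroduce the handshake cost.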

Tip 3: Use Caching for Repeated Decryptions

Scenario: Same encrypted file decrypted 100 times (e.g., serving static content)

Without caching:

100 decryptions × 7ms = 700ms total

With caching (decrypt once, cache plaintext):

1 decryption (7ms) + 99 cache hits (0.1ms each) = 17ms total

Improvement: 41× faster

SDK configuration:

AnkaSecureConfig config = AnkaSecureConfig.builder()
    .enableCache(true)
    .cacheTTL(Duration.ofHours(1))  // Cache for 1 hour
    .cacheMaxSize(1000)  // Up to 1000 plaintexts
    .build();

Security note: Only cache non-sensitive data (e.g., public content, configuration files)
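
To make the decrypt-once idea concrete, here is a minimal in-process sketch of a size-bounded LRU plaintext cache. It is an illustration under stated assumptions, not the SDK's cache: the decryptor function stands in for a real decrypt call, there is no TTL, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class PlaintextCacheSketch {
    private final Function<String, String> decryptor; // stand-in for a real decrypt call
    private final Map<String, String> cache;
    int decryptCalls = 0; // counts real decryptions, for demonstration

    PlaintextCacheSketch(int maxSize, Function<String, String> decryptor) {
        this.decryptor = decryptor;
        // accessOrder=true keeps entries ordered least-recently-used first
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSize; // evict once the bound is exceeded
            }
        };
    }

    String decrypt(String ciphertext) {
        String hit = cache.get(ciphertext);
        if (hit != null) return hit;       // cache hit: no crypto work
        decryptCalls++;
        String plaintext = decryptor.apply(ciphertext);
        cache.put(ciphertext, plaintext);
        return plaintext;
    }

    public static void main(String[] args) {
        PlaintextCacheSketch c = new PlaintextCacheSketch(1000, ct -> "plain:" + ct);
        for (int i = 0; i < 100; i++) c.decrypt("file-A");
        System.out.println("real decryptions: " + c.decryptCalls);
    }
}
```

As the security note says, holding plaintext in memory is only appropriate for non-sensitive data.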

Tip 4: Batch Operations for Bulk Processing

Scenario: Encrypt 1,000 files

Serial approach (one API call per file):

1,000 files × (3ms encrypt + 5ms network) = 8,000ms = 8 seconds

Batch approach (100 files per API call):

10 batches × (300ms encrypt + 5ms network) = 3,050ms = 3 seconds

Improvement: 2.6× faster

API example:

curl -X POST https://api.ankatech.co/batch/encrypt \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "keyId": "mlkem-001",
    "plaintexts": [
      "file1_data...",
      "file2_data...",
      ... (up to 100 files)
    ]
  }'
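
On the client side, batching amounts to splitting the file list into groups of at most 100 before calling the batch endpoint. The sketch below shows the partitioning plus the timing model from the serial/batch comparison above; class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSketch {
    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    /** Timing model from the text: per-file crypto time plus one network round trip per call. */
    static long estimateMs(int files, int batchSize, long encryptMsPerFile, long networkMsPerCall) {
        int calls = (files + batchSize - 1) / batchSize; // ceiling division
        return files * encryptMsPerFile + calls * networkMsPerCall;
    }

    public static void main(String[] args) {
        System.out.println("serial:  " + estimateMs(1000, 1, 3, 5) + " ms");   // 1,000 calls
        System.out.println("batched: " + estimateMs(1000, 100, 3, 5) + " ms"); // 10 calls
    }
}
```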

Benchmark Methodology

Test Environment

Hardware:
- CPU: Intel Xeon Platinum 8375C @ 2.90 GHz (16 cores)
- RAM: 32 GB DDR4
- Disk: NVMe SSD (not a bottleneck for crypto operations)
- Network: 10 Gbps (eliminates network as variable)

Software:
- OS: Ubuntu 24.04 LTS
- JVM: OpenJDK 25 (HotSpot)
- JVM Options: -Xmx8g -XX:+UseG1GC
- Crypto library: Bouncy Castle 1.81 (FIPS-validated)

Test methodology:
- Warm-up: 5 iterations (stabilize JIT compiler)
- Measurement: 10 iterations (median value reported)
- Deterministic RNG: Fixed seed for reproducibility
- Sequential execution: No concurrency effects

Metrics collected:
- Wall-clock time: End-to-end latency
- CPU time: Actual CPU consumption (excludes I/O wait)
- Memory: Heap delta during operation
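
The warm-up/measure/median procedure described above can be sketched as a small harness. This is an assumption-laden illustration rather than the harness used for the published numbers: the workload is a placeholder (filling a 1 KB buffer from a fixed-seed RNG), and only wall-clock time is measured.

```java
import java.util.Arrays;
import java.util.Random;

class MedianBenchmarkSketch {
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Runs warm-up iterations, then returns the median wall-clock time of the measured runs. */
    static double medianMs(Runnable op, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) op.run(); // let the JIT stabilize
        double[] samples = new double[runs];
        for (int i = 0; i < runs; i++) {            // sequential: no concurrency effects
            long start = System.nanoTime();
            op.run();
            samples[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return median(samples);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1024];
        Random rng = new Random(42);                           // fixed seed for reproducible input
        Runnable placeholderOp = () -> rng.nextBytes(payload); // swap in a real encrypt call
        System.out.printf("median: %.4f ms%n", medianMs(placeholderOp, 5, 10));
    }
}
```

For production-grade JVM measurements a dedicated harness such as JMH handles dead-code elimination and forking more rigorously than a hand-rolled loop.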

Payload Sizes Tested

Small payloads (API requests):
- 1 KB - Typical API JSON payload
- 10 KB - Small documents, metadata
- 100 KB - Images, PDFs

Medium payloads (files):
- 500 KB - Office documents, reports
- 1 MB - High-res images, data exports

Large payloads (bulk data):
- 5 MB - Video clips, large PDFs
- Note: For > 5MB, use streaming APIs (better performance)

Detailed Benchmark Results

Symmetric Encryption (AES-GCM)

Best performance for bulk data:

Algorithm	Payload	Encrypt	Decrypt	Throughput
AES-256-GCM	1 KB	3ms	5ms	200 ops/sec
AES-256-GCM	100 KB	4ms	5ms	200 ops/sec
AES-256-GCM	1 MB	16ms	19ms	53 ops/sec
AES-256-GCM	5 MB	71ms	83ms	12 ops/sec

Use case: High-volume data encryption (logs, backups, archives)

Throughput at scale (5MB files):
- 1 node: ~12 files/sec = 60 MB/sec
- 6 nodes: ~72 files/sec = 360 MB/sec
- 12 nodes: ~144 files/sec = 720 MB/sec

Post-Quantum KEM (ML-KEM)

NIST-standardized PQC encryption:

Algorithm	Payload	Encrypt	Decrypt	Notes
ML-KEM-512	1 KB	4ms	8ms	NIST Level 1 (AES-128 equivalent)
ML-KEM-768	1 KB	3ms	6ms	NIST Level 3 (AES-192 equivalent)
ML-KEM-1024	1 KB	3ms	7ms	NIST Level 5 (AES-256 equivalent)
ML-KEM-512	5 MB	58ms	89ms
ML-KEM-768	5 MB	64ms	88ms
ML-KEM-1024	5 MB	61ms	82ms	Recommended

Surprise: ML-KEM-1024 (highest security) has the fastest 5MB decrypt (82ms) and is within 3ms of ML-KEM-512 on encrypt (61ms vs 58ms)!

Recommendation: Use ML-KEM-1024 by default (best security-performance balance)

Classical RSA (Legacy Comparison)

Algorithm	Payload	Encrypt	Decrypt	Security Level
RSA-2048	1 KB	2ms	7ms	~112-bit (below NIST L1)
RSA-3072	1 KB	3ms	8ms	~128-bit (comparable to NIST L1)
RSA-4096	1 KB	3ms	11ms	~152-bit (below NIST L3)
RSA-2048	5 MB	12ms	57ms
RSA-4096	5 MB	11ms	57ms

ML-KEM advantage: 36% faster decrypt (7ms vs 11ms for RSA-4096, 1KB payload)

Composite Keys (Hybrid PQC)

RSA-4096 + ML-KEM-1024 performance:

Operation	RSA-4096 Alone	ML-KEM-1024 Alone	Composite (Both)	Overhead
Key generation	~5000ms	12ms	~5012ms	+0.2%
Encrypt (1KB)	3ms	3ms	~5ms	+67%
Decrypt (1KB)	11ms	7ms	~13ms	+18%
Encrypt (5MB)	11ms	61ms	~70ms	+15%
Decrypt (5MB)	57ms	82ms	~95ms	+16%

Trade-off: 15-67% encrypt/decrypt overhead in exchange for defense-in-depth (an attacker must break both algorithms)
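
The Overhead column is consistent with comparing the composite latency against the slower of the two standalone algorithms. A short sketch of that calculation, with the baseline choice stated as an assumption inferred from the table; names are illustrative:

```java
class CompositeOverhead {
    /** Overhead of the composite operation relative to the slower standalone algorithm. */
    static double overheadPercent(double rsaMs, double mlKemMs, double compositeMs) {
        double baseline = Math.max(rsaMs, mlKemMs); // assumption inferred from the table
        return (compositeMs - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        System.out.printf("Encrypt (1KB): +%.0f%%%n", overheadPercent(3, 3, 5));
        System.out.printf("Decrypt (1KB): +%.0f%%%n", overheadPercent(11, 7, 13));
        System.out.printf("Encrypt (5MB): +%.0f%%%n", overheadPercent(11, 61, 70));
        System.out.printf("Decrypt (5MB): +%.0f%%%n", overheadPercent(57, 82, 95));
    }
}
```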

When to use:
- ✅ High-value data (financial, classified, healthcare)
- ✅ Long retention (10+ years)
- ✅ Compliance required (NIST SP 800-227, GSA PQC)
- ❌ High-frequency trading (latency-critical)

Learn more about composite keys

Performance vs Security Trade-offs

Algorithm Selection Matrix

Use Case	Recommended Algorithm	Latency (1KB decrypt)	Security Level
High-frequency APIs	ML-KEM-768	6ms	NIST L3 (192-bit)
Standard security	ML-KEM-1024	7ms	NIST L5 (256-bit)
Maximum security	RSA-4096 + ML-KEM-1024 composite	13ms	Defense-in-depth
Bulk processing	AES-256-GCM	5ms	Symmetric only (need separate KEK)
Legacy compatibility	RSA-4096	11ms	Classical (not quantum-resistant)

Decision framework:
1. Latency budget: < 10ms → ML-KEM-768, < 20ms → ML-KEM-1024, > 20ms → Composite
2. Security requirement: Standard → ML-KEM-1024, Federal/classified → Composite
3. Data lifetime: < 5 years → ML-KEM, 5-10 years → ML-KEM-1024, > 10 years → Composite
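
The latency and data-lifetime rules of the decision framework can be encoded as simple lookups. This is a hedged sketch: the class is hypothetical, the returned strings are labels rather than API identifiers, and treating a budget of exactly 20ms as "Composite" is an assumption, since the text leaves that boundary open.

```java
class AlgorithmChooser {
    /** Rule 1: pick by latency budget (ms, 1KB decrypt). */
    static String byLatencyBudget(double budgetMs) {
        if (budgetMs < 10) return "ML-KEM-768";
        if (budgetMs < 20) return "ML-KEM-1024";
        return "Composite"; // assumption: exactly 20ms is treated like > 20ms
    }

    /** Rule 3: pick by how long the data must stay confidential. */
    static String byDataLifetime(int years) {
        if (years < 5) return "ML-KEM";
        if (years <= 10) return "ML-KEM-1024";
        return "Composite";
    }

    public static void main(String[] args) {
        System.out.println(byLatencyBudget(8));  // tight interactive budget
        System.out.println(byDataLifetime(15));  // long-retention archive
    }
}
```

When the rules disagree, the stricter security outcome is usually the safer default, but that policy is a judgment call for each deployment.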

Memory Footprint

Heap usage during encryption (5MB payload):

Algorithm	Encrypt Heap	Decrypt Heap	Total Memory	Notes
AES-256-GCM	74 MB	0 MB	74 MB	Symmetric (low overhead)
ML-KEM-1024	41 MB	0 MB	41 MB	PQC (efficient)
RSA-4096	41 MB	74 MB	115 MB	Classical (higher memory)

Recommendation: For memory-constrained environments, prefer ML-KEM (41 MB) over RSA (115 MB)

Run Your Own Benchmarks

Option 1: Use AnkaSecure Performance Toolkit

Download:

curl -sSL https://ankatech.co/perf-toolkit.tar.gz | tar xz
cd ankasecure-perf-toolkit

Run benchmark:

./benchmark.sh \
  --endpoint https://api.ankatech.co \
  --algorithm ML_KEM_1024 \
  --payload-size 1KB \
  --iterations 100 \
  --token $YOUR_API_TOKEN

✅ Output:

Benchmark Results:
  Algorithm: ML-KEM-1024
  Payload: 1 KB
  Iterations: 100

  Encrypt: median 3ms, p95 4ms, p99 5ms
  Decrypt: median 7ms, p95 9ms, p99 11ms
  Throughput: 125 ops/sec

Customize:
- Try different algorithms (RSA-4096, ML-KEM-768, AES-256-GCM)
- Vary payload sizes (1KB, 100KB, 1MB)
- Test composite keys (--algorithm COMPOSITE_RSA_MLKEM)

Option 2: Quick cURL Test

Simple latency check (no toolkit needed):

# Measure encrypt latency
time curl -X POST https://api.ankatech.co/encrypt \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"algorithm":"ML_KEM_1024","plaintext":"..."}'

# Measure decrypt latency
time curl -X POST https://api.ankatech.co/decrypt \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"keyId":"...","ciphertext":"..."}'

Tip: Run 10 times and take median (ignore first run for JIT warmup)

Option 3: Load Testing with JMeter

Scenario: Test 10,000 concurrent operations

JMeter test plan: Download AnkaSecure-LoadTest.jmx

Steps:
1. Import test plan into JMeter
2. Configure endpoint and authentication
3. Set thread count (1,000 users)
4. Set ramp-up period (10 seconds)
5. Run test (10,000 iterations)

Metrics reported:
- Average latency
- p90, p95, p99 latencies
- Throughput (ops/sec)
- Error rate (%)
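
If you post-process raw latency samples yourself, the p90/p95/p99 values can be computed with the nearest-rank method, sketched below. JMeter's own percentile calculation may differ slightly in how it interpolates, so treat this as an approximation for cross-checking; the class name is illustrative.

```java
import java.util.Arrays;

class LatencyPercentiles {
    /** Nearest-rank percentile: smallest sample with at least p% of samples at or below it. */
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        double[] latenciesMs = new double[100];
        for (int i = 0; i < latenciesMs.length; i++) latenciesMs[i] = i + 1; // synthetic 1..100 ms
        System.out.println("p90 = " + percentile(latenciesMs, 90));
        System.out.println("p95 = " + percentile(latenciesMs, 95));
        System.out.println("p99 = " + percentile(latenciesMs, 99));
    }
}
```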

Performance SLA (SaaS)

AnkaSecure SaaS performance guarantees:

Metric	Standard Tier	Premium Tier	Enterprise Tier
API latency (p95)	< 50ms	< 20ms	< 10ms
Throughput	100 ops/sec	1,000 ops/sec	10,000+ ops/sec
Availability	99.9%	99.95%	99.99%
Concurrent connections	100	1,000	10,000+
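
To put the availability row in concrete terms, each percentage can be converted into a downtime allowance. A minimal sketch assuming a 365-day year; the class name is illustrative and the figures are plain arithmetic, not contractual terms:

```java
class SlaDowntime {
    /** Maximum downtime per year implied by an availability percentage. */
    static double allowedDowntimeHoursPerYear(double availabilityPercent) {
        return (100.0 - availabilityPercent) / 100.0 * 365 * 24;
    }

    public static void main(String[] args) {
        for (double a : new double[]{99.9, 99.95, 99.99}) {
            System.out.printf("%.2f%% -> %.2f hours/year%n", a, allowedDowntimeHoursPerYear(a));
        }
    }
}
```

By this arithmetic the Standard tier allows about 8.76 hours of downtime per year, and the Enterprise tier under one hour.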

Monitoring: Real-time status page at status.ankatech.co

FAQ

How does AnkaSecure compare to hardware HSMs?

Hardware HSMs (Luna, nShield):
- Pros: FIPS 140-2 Level 3/4, tamper-resistant
- Cons: Low throughput (100-500 ops/sec), expensive ($10K-$50K per device)

AnkaSecure (software + optional HSM):
- Pros: High throughput (12,000 ops/sec), cost-effective ($25K/year), scales horizontally
- Cons: Software-based (FIPS 140-2 Level 1)

Hybrid approach: Use AnkaSecure for crypto operations + Luna/nShield for key wrapping (best of both)

Can I use GPU acceleration?

Future roadmap: GPU acceleration for ML-KEM planned for v3.1 (Q2 2026)

Expected improvement: 3-5× throughput for ML-KEM operations

Current state: CPU-only (still production-ready for most workloads)

What's the largest payload size supported?

Compact APIs (in-memory): Up to 5 MB per request

Streaming APIs (chunked): Unlimited (tested up to 100 GB files)

Recommendation:
- ≤ 5 MB → Use compact APIs (simpler)
- > 5 MB → Use streaming APIs (better performance, lower memory)

Learn more about streaming operations

How do I monitor performance in production?

Built-in monitoring (included in all tiers):
- Metrics API: Query latency, throughput, error rates
- Admin Console: Real-time dashboards
- Audit logs: Performance anomalies logged

Example: Query last hour performance:

curl https://api.ankatech.co/analytics/performance \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "period": "last_1_hour",
    "metrics": ["latency_p95", "throughput", "error_rate"]
  }'

Output:

{
  "latency_p95": "8ms",
  "throughput": "1,200 ops/sec",
  "error_rate": "0.02%"
}

What's Next?

Ready to test performance?
- 🚀 Run benchmarks (5-minute quick test)
- 📥 Download performance toolkit (JMeter + scripts)
- 📊 Compare your workload (estimate latency for your use case)
- 📧 Request load testing session (we'll help size your deployment)

Optimize your deployment:
- Algorithm selection guide - Choose best algorithm for your needs
- Scaling guide - Horizontal scaling best practices
- Caching strategies - Reduce latency with intelligent caching

Explore alternatives:
- Composite keys performance - Detailed overhead analysis
- Streaming performance - Multi-GB file benchmarks

Have questions? Email [email protected] or join our community forum

Benchmarks last updated: 2025-11-12 | Platform version: 3.0.0 | Test environment: Intel Xeon 8375C (16 cores), 32GB RAM