Couchbase Query Analysis Hub

Concise guidance to process data, interpret stats, and optimize queries.

Why is my query slow? Start with Reading the Performance Data

How to Process and Generate Charts/Stats

1) Extract Data

Run this in Query Workbench, Capella Query tab, or cbq, then export as JSON.

SELECT *, meta().plan FROM system:completed_requests ORDER BY requestId LIMIT 2000;

Tip: For larger datasets, prefer targeted filters (see below) before exporting.

Sample completed_requests JSON

2) Paste or Upload JSON

Open Analyzer — in the upper-left input, paste or upload your JSON and it will auto-parse, or click Parse JSON.

Parse JSON button in the analyzer

Private & secure: All processing happens entirely in your browser's memory. No server uploads.

3) Parse JSON & Filters Overview

After parsing, use filters to focus analysis:

  • Date/Time Range — From/To with quick presets (Original | 1 Week | 1 Day | 1 Hour)
  • SQL++ Statement Contains — text filter for keywords/patterns

    PRO — a great way to focus on a single query pattern and reduce noise from other queries

  • Elapsed Time Filter — comparisons and ranges (e.g., >=500ms, 0.5s-2s)
    Elapsed Filter Examples

    Filter by each query's elapsedTime using comparisons and ranges:

    • Comparators: >500ms, >=1s, <2s, =150ms
    • Ranges (inclusive): 100-500ms, 0.5s-2s, 3-15s (unit applied to both sides)
    • Shorthand: 500ms+ (same as >=500ms)
    • Bare number: 150 is interpreted as >=150ms
  • Timezone Picker — Display chart/graph x-axis times in a chosen timezone; selecting one from the dropdown converts all timestamps accordingly.
  • Exclude System Queries — hides system:*, INFER/ADVISE, CREATE/ALTER INDEX, etc.
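As an illustration, the elapsed-filter grammar above could be handled by a small parser like the following sketch (function and variable names are hypothetical, not the analyzer's actual code; strict `<`/`>` comparisons are approximated with a tiny epsilon):

```python
import re

UNITS = {"ms": 1.0, "s": 1000.0}  # multipliers to milliseconds

def parse_elapsed_filter(expr):
    """Parse an elapsed-time filter into an inclusive (lo_ms, hi_ms) range.

    Supports: >500ms, >=1s, <2s, =150ms, ranges 100-500ms / 0.5s-2s / 3-15s,
    shorthand 500ms+, and bare numbers (N means >= N ms).
    """
    expr = expr.strip().replace(" ", "")

    def to_ms(num, unit):
        return float(num) * UNITS[unit or "ms"]

    # Range: 100-500ms or 0.5s-2s (a trailing unit applies to both sides)
    m = re.fullmatch(r"([\d.]+)(ms|s)?-([\d.]+)(ms|s)?", expr)
    if m:
        lo_n, lo_u, hi_n, hi_u = m.groups()
        unit = hi_u or lo_u  # unit-inferred ranges like 3-15s
        return (to_ms(lo_n, lo_u or unit), to_ms(hi_n, unit))
    # Shorthand: 500ms+  ->  >= 500ms
    m = re.fullmatch(r"([\d.]+)(ms|s)?\+", expr)
    if m:
        return (to_ms(*m.groups()), float("inf"))
    # Comparators: >, >=, <, <=, =
    m = re.fullmatch(r"(>=|<=|>|<|=)([\d.]+)(ms|s)?", expr)
    if m:
        op, num, unit = m.groups()
        v = to_ms(num, unit)
        if op == "=":
            return (v, v)
        if op == ">":
            return (v + 1e-9, float("inf"))  # strict > approximated
        if op == ">=":
            return (v, float("inf"))
        if op == "<":
            return (0.0, v - 1e-9)  # strict < approximated
        return (0.0, v)  # <=
    # Bare number: 150 means >= 150ms
    m = re.fullmatch(r"([\d.]+)", expr)
    if m:
        return (to_ms(m.group(1), "ms"), float("inf"))
    raise ValueError(f"unrecognized filter: {expr}")
```

The key design point is that a missing unit inherits from the other side of a range, and defaults to milliseconds otherwise.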
4) Analyzer Tabs

  • Dashboard — High-level charts and distributions
  • Insights — Automated checks and quick wins
  • Timeline — Time-series performance with zoom
  • Query Groups — Normalized patterns and aggregates
  • Every Query — Sortable table of executions
  • Index/Query Flow — Relationship diagram
  • Indexes — Inventory and performance of indexes
5) Add Indexes JSON (Optional but Recommended)

Run this separate query for system:indexes and paste/upload to the upper-right input to enrich Indexes and Flow views.

SELECT 
    s.name,
    s.id,
    s.metadata,
    s.state,
    s.num_replica,
    s.`using` AS indexType,
    CONCAT("CREATE INDEX ", s.name, " ON ", k, ks, p, w, ";") AS indexString
FROM system:indexes AS s
LET bid = CONCAT("", s.bucket_id, ""),
    sid = CONCAT("", s.scope_id, ""),
    kid = CONCAT("", s.keyspace_id, ""),
    k = NVL2(bid, CONCAT2(".", bid, sid, kid), kid),
    ks = CASE WHEN s.is_primary THEN "" ELSE "(" || CONCAT2(",", s.index_key) || ")" END,
    w = CASE WHEN s.condition IS NOT NULL THEN " WHERE " || REPLACE(s.condition, '"', "'") ELSE "" END,
    p = CASE WHEN s.`partition` IS NOT NULL THEN " PARTITION BY " || s.`partition` ELSE "" END;

Sample system:indexes JSON

What the Stats and Tables Mean

Reading the Performance Data #

The Every Query tab shows per-query statistics so you can spot bottlenecks and drill into detailed operation timings.

Query statistics table showing performance metrics
  • elapsedTime — Total wall-clock from request acknowledgment to last byte sent.
  • serviceTime — Time actively executing the query, including waits on other services (Index, Data, FTS).
  • executionTime — Internal query processing time (engine logic).
  • cpuTime — Cumulative CPU consumed across threads (can exceed wall time).
  • kernTime — Time spent waiting for CPU scheduling by the OS.
  • resultCount — Number of documents returned to the client.
  • resultSize — Total bytes sent; affects network transfer time.
  • phaseCounts.fetch — Number of full documents retrieved from the Data service.
  • phaseCounts.indexScan — Number of index entries returned by the Index service.

Watch-outs: large resultSize stretches elapsedTime; high usedMemory suggests heavy sorts/aggregations or big payloads.
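To make the watch-outs concrete, here is a rough way to derive such red flags from a single completed_requests entry (the thresholds and helper name are illustrative assumptions, not the analyzer's logic):

```python
def watchouts(req):
    """Derive quick red flags from one completed_requests entry (illustrative)."""
    flags = []
    count = req.get("resultCount", 0)
    # Large average payload per returned document stretches elapsedTime.
    if count and req.get("resultSize", 0) / count > 10_000:
        flags.append("large average result size - trim the projection")
    # Fetching far more documents than are returned means post-scan filtering.
    fetched = req.get("phaseCounts", {}).get("fetch", 0)
    if fetched > 10 * max(count, 1):
        flags.append("fetches far exceed results - index is not selective")
    return flags
```

For example, a request returning 10 documents but fetching 5,000 trips both checks.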

System & Node Impact #

Performance is influenced by node CPU, memory, and service placement. Large results also increase network time.

Query node architecture impact
  • kernTime — Time spent waiting for CPU scheduling by the OS.
  • cpuTime — Cumulative CPU consumed across threads (can exceed wall time).
  • usedMemory — Peak memory used to execute a query.

The image above shows the Query service handling multiple concurrent requests on a single node. The OS kernel schedules query threads onto a limited number of CPU cores, so parts of a query must “wait” when resources are contended. This scheduling delay appears as kernTime at the operator level and stretches execution/elapsed time without doing useful work.

  • Limited CPU cores → thread contention and context switching under load
  • Memory pressure → GC/paging can increase pauses and reduce throughput
  • Service co-location (Query with Index/KV) → competition for CPU and memory on the same node
  • Large resultSize → longer post-execution network transfer time

Reduce contention by scaling query nodes, tuning concurrency, creating covering indexes to reduce fetches, and minimizing payload size.

ServiceTime in Context #

The Query service relies on other services in the cluster to operate (Data, Index, FTS, and the auth service). serviceTime is the sum of time spent waiting on all of these services for a particular query.

Pro Insights: How to Read the Stats PRO #
  • High kernTime → CPU contention; check system load and concurrency.
  • High phaseCounts.fetch → add covering indexes; reduce fetches.
  • Large resultSize vs resultCount → optimize projection; narrow scans.
  • elapsedTime ≫ serviceTime → queuing/network; scale query nodes/services.
  • cpuTime ≈ serviceTime → CPU-bound; simplify expressions or add cores.

Quick patterns: elapsedTime ≫ serviceTime → queuing or network; cpuTime ~ serviceTime → CPU-bound; kernTime high → CPU contention.
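The quick patterns can be turned into a small triage helper; the 2x and 30% cutoffs below are illustrative choices of mine, not values used by the analyzer:

```python
def classify(elapsed_ms, service_ms, cpu_ms, kern_ms):
    """Rough triage using the quick patterns above (thresholds illustrative)."""
    findings = []
    if elapsed_ms > 2 * service_ms:  # elapsedTime >> serviceTime
        findings.append("queuing or network: scale query nodes / check transfer")
    if service_ms and 0.8 <= cpu_ms / service_ms <= 1.2:  # cpuTime ~ serviceTime
        findings.append("CPU-bound: simplify expressions or add cores")
    if elapsed_ms and kern_ms / elapsed_ms > 0.3:  # high kernTime
        findings.append("CPU contention: check load and concurrency")
    return findings or ["no obvious pattern"]
```

A query may of course trip more than one pattern at once; treat the output as a starting point, not a diagnosis.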

See also: Performance Data and ServiceTime Context.

Crawl → Walk → Run

CRAWL · Basic Index Strategy #
CRAWL: full index scan pattern
  • High phaseCounts.fetch; phaseTimes.fetch dominates.
  • Larger resultSize; moderate serviceTime.
  • Often primary or non-covering indexes; broad scans.

Success: reduce phaseCounts.fetch and phaseTimes.fetch by 30–50%.

See the glossary: serviceTime, elapsedTime

WALK · Targeted Composite Indexes #
WALK: reduced index scan pattern
  • Reduced phaseCounts.fetch via better pre-filtering in indexes.
  • More selective phaseCounts.indexScan; improved serviceTime.
  • Projection refined; data scanned narrows.

Success: sustained drops in fetch counts/time; stable or lower resultSize; fewer I/O waits.

See the glossary: executionTime, serviceTime

RUN · Covering Indexes #
RUN: covering index pattern
  • phaseCounts.fetch = 0 — no document fetches.
  • Minimal resultSize (project only needed fields).
  • Minimal serviceTime; low kernTime.

Success: 70–90% reduction in serviceTime; faster and more predictable queries.

See the glossary: kernTime, serviceTime

In Summary: Crawl / Walk / Run Metrics
  • CRAWL: High phaseCounts.fetch, large resultSize, higher serviceTime
  • WALK: Reduced fetch, more selective indexScan, improved serviceTime
  • RUN: fetch=0 (covering index), minimal resultSize, optimal serviceTime
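One way to sketch this classification in code — the fetch-to-result ratio and the 2x cutoff are illustrative assumptions, not the tool's exact rules:

```python
def maturity_stage(fetch, result_count):
    """Rough Crawl/Walk/Run classification from phaseCounts.fetch vs resultCount."""
    if fetch == 0:
        return "RUN"    # covering index: no document fetches at all
    if result_count and fetch <= 2 * result_count:
        return "WALK"   # index pre-filters well; fetches roughly what it returns
    return "CRAWL"      # broad scan: fetches far more documents than it returns
```

For instance, a query that fetches 5,000 documents to return 10 is still crawling.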

Performance

What is a Couchbase Index? #

A Couchbase index is a data structure that creates shortcuts to your documents, allowing queries to find data efficiently without scanning entire buckets. When you create indexes with CREATE INDEX statements, they are stored and managed by the Index Service.

Couchbase CREATE INDEX commands showing how indexes are organized on Index Node with key-value pairs

How Index Creation Works

As shown in the diagram above, when you execute CREATE INDEX statements:

  • custId_status_v1: Creates an index on (status, custId) fields, organizing data like "done,1234", "done,5678", "draft,1234"
  • email_v1: Creates an index on (email) field, organizing data like "@yahoo.com", "@gmail.com", "@aol.com"
  • Index Node Storage: All indexes are stored on dedicated Index Service nodes
  • Key-Value Organization: Each index maintains sorted key-value pairs for fast lookups
  • Document References: Index values point to document keys (like 1234, 5678) for quick document retrieval
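The sorted key-value layout above can be mimicked in a few lines to show why leading-key range scans are cheap. The entries mirror the diagram; the document keys (e.g. order::1234) are hypothetical:

```python
from bisect import bisect_left, bisect_right

# Miniature stand-in for the custId_status_v1 index: sorted (status, custId)
# keys, each pointing at a document key, as in the diagram above.
index = sorted([
    (("done", "1234"), "order::1234"),
    (("done", "5678"), "order::5678"),
    (("draft", "1234"), "order::9012"),
])
keys = [k for k, _ in index]

def scan(status):
    """Range scan on the leading index key: all entries with this status."""
    lo = bisect_left(keys, (status, ""))
    hi = bisect_right(keys, (status, "\uffff"))  # upper bound past any custId
    return [doc for _, doc in index[lo:hi]]
```

A real GSI index adds storage paging, memory residency, and replicas; this only illustrates how sorted keys let a scan touch just the matching slice.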
Index Memory Usage: The Performance Game-Changer #

The percentage of your index stored in memory directly impacts query performance. Higher memory residency means faster query execution, while lower memory residency can lead to disk I/O and slower responses.

High Memory Residency (90%+)

Couchbase index with high percentage of data in memory showing optimal performance

Optimal Performance: Most index data is in RAM, resulting in fast lookups and minimal disk I/O. Ideal for frequently accessed indexes.

⚠️ Low Memory Residency (20-40%)

Couchbase index with small percentage of data in memory showing performance impact

Performance Impact: Frequent disk reads required, leading to higher latency and reduced throughput. Consider increasing memory allocation.

Index Memory Optimization Strategies

Memory Optimization Tips

  • Monitor Memory Residency: Use our tool to track what percentage of your indexes are in memory
  • Increase Index RAM: Allocate more memory to the Index Service for frequently used indexes
  • Index Selectivity: Create more selective indexes to reduce overall memory footprint
  • Partition Indexes: Use partitioned indexes to distribute memory load across nodes
  • Archive Old Data: Remove or separate historical data that doesn't need high-performance access

Index Performance Impact

Memory residency directly affects:

  • Query Latency: In-memory indexes respond 10-100x faster than disk-based lookups
  • Throughput: Higher memory residency allows more concurrent queries without performance degradation
  • CPU Usage: Less CPU spent waiting for disk I/O when indexes are in memory
  • Consistency: Predictable performance when indexes don't compete for disk resources
⚠️ Drawbacks of Primary Indexes WARNING #

While primary indexes in Couchbase provide a basic way to scan all documents in a bucket by their keys, they come with significant drawbacks that make them unsuitable for most production scenarios. Primary indexes lead to very slow performance because they fetch all documents across all types in the bucket before applying filters, resulting in unnecessary I/O, memory, and CPU waste.

  • Performance Impact: Excessive document retrievals and post-scan filtering make operations "VERY EXPENSIVE"
  • Resource Waste: Unnecessary I/O, memory, and CPU consumption
  • Not Recommended: The Couchbase index advisor never recommends primary indexes
  • Better Alternatives: Secondary or composite indexes are almost always more efficient

Recommendation: Avoid primary indexes in production—use them only for initial data exploration or when no other index applies, and opt for targeted secondary indexes to minimize latency and resource usage.

⚠️ Drawbacks of Sequential Scans WARNING #

Sequential scans in Couchbase involve scanning documents directly without an index, which can be simple but comes with performance limitations. "Sequential scans are intended for simple, ready access to data, and are not intended as a high performance solution."

  • Limited Use Cases: Best suited to small collections where key order is unimportant
  • Index Overhead: Appropriate only when the overhead of maintaining an index can't be justified
  • Performance Issues: Lead to full bucket traversals, high latency, and increased resource consumption
  • Primary Index Alternative: For ordered document key operations, a primary index provides the same functionality and will outperform a sequential scan

Recommendation: Avoid sequential scans for large datasets or frequent queries—always prioritize indexing for scalability.

⚠️ Slow Parse and Plan Times #

Parse (SQL++ text → internal structure) and plan (optimizer choosing execution strategy) phases normally complete in microseconds. When either exceeds 1ms, it indicates CPU contention or kernel scheduling delays—queries waiting for CPU time instead of executing.

Example - Normal times (reference query):

Query: SELECT * , meta() FROM `travel-sample` ORDER BY `type` DESC

phaseTimes: {
  "parse": "271.666µs",    ✅ Normal (< 1ms)
  "plan": "441.75µs"       ✅ Normal (< 1ms)
}
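As a sketch of how such a check might work (assuming Go-style duration strings like those above; the helper names are hypothetical):

```python
import re

def to_ms(d):
    """Convert a Go-style duration string ('271.666µs', '1.5ms', '2s') to ms."""
    mult = {"ns": 1e-6, "µs": 1e-3, "us": 1e-3, "ms": 1.0, "s": 1000.0}
    total = 0.0
    # Longer unit names must come before their prefixes in the alternation.
    for num, unit in re.findall(r"([\d.]+)(ns|µs|us|ms|s)", d):
        total += float(num) * mult[unit]
    return total

def flag_parse_plan(phase_times, threshold_ms=1.0):
    """Return the parse/plan phases that exceed the 1ms health threshold."""
    return {p: t for p, t in phase_times.items()
            if p in ("parse", "plan") and to_ms(t) > threshold_ms}
```

An empty result means parse and plan are healthy; anything returned points at CPU contention rather than query complexity.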

Why times > 1ms are problematic:

  • CPU Saturation: Query service node lacks available CPU cores
  • Kernel Waits: Threads queued in OS scheduler (see kernTime)
  • Co-located Services: Query competing with Index/Data services on same node
  • High Concurrency: Too many simultaneous queries

Diagnose via:

  • system:vitals → cpu.user.percent (>80%), request.active.count
  • Operator stats → high kernTime throughout execution plan
  • Node config → check if services are co-located

Fixes: Reduce query concurrency, use prepared statements (skip parse), add dedicated query nodes, separate co-located services, or optimize complex queries.

⚠️ Complex JOIN Operations BETA #

JOIN operations in Couchbase N1QL/SQL++ can introduce significant performance overhead when not properly optimized. Complex JOINs are flagged based on multiple indicators (A-H) that signal resource contention, inefficient execution patterns, or unexpected data explosion.

Common JOIN Complexity Flags (A-H)
  • 🔴 Flag A - Primary Scan in JOIN: Using primary index instead of secondary index during JOIN phase (Critical). The right-side keyspace lacks a proper index on the join predicate, forcing a full collection scan.
  • 🔴 Flag B - Cartesian Product: CROSS JOIN or missing/poor ON clause causing exponential result multiplication (Critical). Example: joining 1,000 rows with 1,000 rows produces 1,000,000 results.
  • 🔴 Flag D - Severe Data Explosion: Result set 10x+ larger than input documents (Critical). Often caused by UNNEST followed by JOIN where each array element joins to multiple records.
  • 🟡 Flag C - Moderate Data Explosion: Result set 2x-10x larger than input (High). Indicates JOIN conditions that aren't selective enough or UNNEST creating intermediate expansion.
  • 🟡 Flag E - Slow JOIN Phase: JOIN phase time ≥2 seconds (High). Join execution is taking too long, potentially due to large datasets, missing indexes, or poor join predicates.
  • 🟡 Flag F - High Document Processing: JOIN processed ≥100,000 documents (High). Large intermediate result sets consuming memory and CPU.
  • 🟡 Flag H - JOIN Time Dominant: JOIN phase consumes ≥30% of total query time (High). The join operation is the bottleneck in query execution.
  • 🟠 Flag G - Multiple JOINs: Query contains 4+ JOIN keywords (Medium). Complex multi-table joins increase complexity and maintenance difficulty.
Example - UNNEST + JOIN Data Explosion
Query: SELECT o.*, p.* FROM orders o 
       UNNEST o.items AS item 
       JOIN products p ON KEYS item.productId

Input:  10,000 orders
UNNEST: 50,000 items (avg 5 items per order)
JOIN:   50,000 products matched
Result: 50,000 rows (5x explosion) ⚠️

Flags Triggered:
- C: 5.0x explosion (10,000 → 50,000)
- E: JOIN took 2.80s
- H: JOIN is 32.7% of query time
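Using the flag definitions above, the explosion and timing flags could be computed as follows (a sketch, not the analyzer's exact logic; the total query time in the test scenario is an assumed value):

```python
def join_flags(input_docs, result_rows, join_secs, total_secs):
    """Evaluate the explosion/time flags (C, D, E, H) from the definitions above."""
    flags = {}
    ratio = result_rows / input_docs if input_docs else 0.0
    if ratio >= 10:
        flags["D"] = f"severe data explosion ({ratio:.1f}x)"
    elif ratio >= 2:
        flags["C"] = f"moderate data explosion ({ratio:.1f}x)"
    if join_secs >= 2:
        flags["E"] = f"slow JOIN phase ({join_secs:.2f}s)"
    if total_secs and join_secs / total_secs >= 0.30:
        flags["H"] = f"JOIN is {100 * join_secs / total_secs:.1f}% of query time"
    return flags
```

Note that a 5x explosion falls in the 2x-10x band, i.e. Flag C by the definitions above; only 10x+ reaches Flag D.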
Why JOINs Can Be Problematic
  • Network Round-Trips: Each join may require fetching documents from Data Service
  • Memory Consumption: Intermediate result sets held in memory during execution
  • CPU Overhead: Matching and filtering join predicates
  • Index Dependencies: Performance heavily depends on proper index coverage on both sides
  • UNNEST Multiplication: Array expansion before JOIN multiplies dataset size exponentially
Optimization Strategies
  • Create Secondary Indexes: Ensure both left and right keyspaces have indexes on join keys (fixes Flag A)
  • Use Covering Indexes: Include projected fields in index definition to avoid document fetches
  • Add WHERE Filters Early: Filter data before JOIN to reduce intermediate result set size
  • Denormalize Data: For frequently joined data, embed related documents to eliminate JOINs
  • Optimize JOIN Order: Join smallest dataset first, then progressively add larger datasets
  • Review UNNEST Usage: Consider if array elements truly need joining or can be filtered first
  • Use Prepared Statements: Cache execution plans for frequently executed JOINs
  • Monitor JOIN Phase Metrics: Track phaseTimes.join and phaseCounts.join to identify bottlenecks
When to Denormalize vs JOIN
  • Denormalize (embed) if: Data is read frequently together, updates are infrequent, relationship is 1-to-1 or 1-to-few
  • Use JOIN if: Data changes frequently, relationship is many-to-many, need to maintain referential integrity
  • Hybrid Approach: Embed critical fields, JOIN for full details when needed

💡 Recommendation: Monitor JOIN queries with multiple complexity flags (especially A, B, D) as they often indicate systemic design issues. Consider denormalizing frequently-joined data paths or restructuring queries to reduce JOIN overhead.

⚠️ Concurrent Query Conflicts BETA #

Concurrent query conflicts occur when Couchbase services (Query, Data, Index) become overloaded, causing resource contention and performance degradation. Unlike peak vs. off-peak analysis, this insight detects service-level pressure through phase timing anomalies, regardless of when queries run.

Service Pressure Indicators (Flags A-H)
  • 🟡 Flag A - Query Service Warning: Parse+Plan time >1ms (should be <1ms). Indicates Query Service CPU pressure when running hundreds of queries/sec on an 8-core system.
  • 🔴 Flag B - Query Service Critical: Parse+Plan time >10ms. Query Service critically overloaded, immediate scaling required.
  • 🟡 Flag C - Data Service Warning: Fetch >5ms per document (should be <1ms). KV service under pressure, possibly due to memory/disk I/O contention.
  • 🔴 Flag D - Data Service Critical: Fetch >10ms per document. KV service critically slow, immediate investigation required.
  • 🟡 Flag E - Index Service Warning: Index scan <5K records/sec (should be >10K). Index service running slow, potentially due to resource constraints.
  • 🔴 Flag F - Index Service Critical: Index scan <1K records/sec. Index service critically slow, add replicas or optimize indexes.
  • 🟡 Flag G - CPU Contention: Kernel time >30% of elapsed time. CPU scheduling overhead indicating processes fighting for cores.
  • 🔴 Flag H - System-Wide Pressure: Multiple services (2+) under pressure simultaneously. System-wide resource exhaustion detected.
Example - System-Wide Pressure
Query: SELECT o.*, c.* FROM orders o JOIN customers c ON o.customerId = c.id

Metrics:
- Parse+Plan:  5.0ms (Flag A - should be <1ms)
- Fetch:       15ms per doc for 100 docs (Flag D - >10ms/doc)
- Kernel Time: 6,000ms / 15,000ms = 40% (Flag G - >30%)
- Services:    Query, Data, CPU/OS affected (Flag H)

Flags Triggered: A, D, G, H
Severity: 🔴 Critical - System-Wide Pressure

This indicates the entire Couchbase cluster is under heavy load with 
Query Service CPU contention, KV service slow fetches, and OS-level 
CPU scheduling overhead.
How Detection Works

Key Insight: system:completed_requests only captures slow queries (>1 second). We detect service pressure by analyzing phase timing anomalies, not by comparing fast vs. slow query baselines.

  • Query Service Pressure: Parse+Plan time should be <1ms. Values >1ms indicate Query Service CPU contention.
  • Data Service Pressure: Calculate fetch-per-document ratio. Normal: <1ms/doc, Warning: >5ms/doc, Critical: >10ms/doc.
  • Index Service Pressure: Calculate index scan throughput (records/sec). Normal: >10K/sec, Warning: <5K/sec, Critical: <1K/sec.
  • CPU Contention: Kernel time percentage indicates OS scheduler overhead. Normal: <10%, Warning: >30%, Critical: >50%.
  • System-Wide Issues: When 2+ services show pressure simultaneously, flag as cluster-wide resource exhaustion.
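The detection rules above can be expressed directly as thresholds (a sketch with illustrative function and parameter names; the numbers come from the bullets above):

```python
def service_pressure(parse_plan_ms, fetch_ms_per_doc, scan_per_sec, kern_pct):
    """Apply the per-service thresholds above; returns triggered flags A-H."""
    flags = []
    if parse_plan_ms > 10:            # Query Service critical
        flags.append("B")
    elif parse_plan_ms > 1:           # Query Service warning
        flags.append("A")
    if fetch_ms_per_doc > 10:         # Data Service critical
        flags.append("D")
    elif fetch_ms_per_doc > 5:        # Data Service warning
        flags.append("C")
    if scan_per_sec < 1_000:          # Index Service critical
        flags.append("F")
    elif scan_per_sec < 5_000:        # Index Service warning
        flags.append("E")
    if kern_pct > 30:                 # CPU contention
        flags.append("G")
    # H: two or more distinct services under pressure at the same time
    services = {"A": "query", "B": "query", "C": "data", "D": "data",
                "E": "index", "F": "index", "G": "cpu"}
    if len({services[f] for f in flags}) >= 2:
        flags.append("H")
    return flags
```

Fed the example metrics above (5ms parse+plan, 15ms/doc fetch, healthy index scans, 40% kernel time), this yields flags A, D, G, and H.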
Resolution Strategies
  • Query Service Pressure (A, B): Scale Query nodes horizontally, reduce query concurrency, implement rate limiting.
  • Data Service Pressure (C, D): Review KV operations, check memory/disk I/O, add Data Service nodes, optimize document sizes.
  • Index Service Pressure (E, F): Add index replicas, optimize index definitions, scale Index Service nodes.
  • CPU Contention (G): Reduce concurrent query load, add CPU cores, review co-located processes, optimize query logic.
  • System-Wide Pressure (H): Critical - requires immediate cluster scaling or load reduction. Review workload distribution and resource allocation.
Service Pressure vs Peak Hour Analysis
  • Traditional Approach: Compare query times during peak vs. off-peak hours to identify concurrency issues.
  • Limitation: system:completed_requests only captures slow queries (>1 second), so we can't establish fast baselines.
  • Our Approach: Detect service pressure through phase timing anomalies (Parse+Plan >1ms, Fetch >5ms/doc, etc.) which indicate resource contention.
  • Advantage: Works with slow-query-only data and directly identifies which service (Query, Data, Index, CPU) is under pressure.

💡 Recommendation: Queries with Flag H (System-Wide Pressure) indicate cluster-level resource exhaustion. Immediate action required: scale horizontally, optimize workload, or implement query throttling. Monitor individual service flags (A-G) to identify specific bottlenecks for targeted optimization.

Glossary of Terms

scanConsistency

Controls freshness vs performance of index scans.

  • not_bounded (unbounded) — fastest; may return slightly stale results.
  • at_plus — waits for indexes to catch up to last known mutation time.
  • request_plus — strongest; indexes sync to this request’s timestamp (slowest).

Guidance: use unbounded for analytics, at_plus for recent writes, request_plus for strict freshness.

serviceTime

Calendar time actively executing the query, including waits on Index/Data/FTS services.

  • High relative to elapsedTime → execution, not queuing, is the bottleneck.
  • Dominated by servTime in profiling → external services are slow.

See timing relationships

elapsedTime

Total end-to-end time: queuing + execution + result transmission.

  • elapsedTime ≫ execution/service → queuing or large result transfer.
  • Large resultSize or slow network stretches elapsedTime.

See timing diagram

executionTime

Pure query processing time in the Query service (parse → plan → execute), excluding waits on services and result transmission.

  • High values indicate computational complexity or inefficient plans.
  • Optimize via EXPLAIN, better indexes (covering), and simplified logic.
  • Context: see Performance Data for metric relationships.
cpuTime

Cumulative CPU across threads; may exceed wall time due to parallelism.

  • cpuTime ~ serviceTime → CPU‑bound workload.
  • Check sorts, aggregations, and complex expressions.
kernTime

Time the OS kernel spends scheduling the query’s threads (waiting for a CPU), not doing useful work.

  • High kernTime → CPU contention/overload; threads compete for cores.
  • Check node CPU metrics, thread counts, and co-located services; reduce concurrency or add cores.
  • Per operator: time ≈ execTime + servTime + kernTime; high kernTime stretches execution/elapsed time.

See also: Performance Data and System & Node Impact.

state

Execution status of a request.

  • completed — finished successfully
  • running — currently executing
  • cancelled — terminated before completion
  • timeout — exceeded configured timeout
  • error — failed; inspect errors
usedMemory

Peak document memory used (requires memory_quota).

  • High values indicate big payloads, large sorts/aggregations, or inefficient scans.
  • Correlate with resultSize and phase patterns; consider LIMIT/covering indexes.