Profiling

GAIA includes optional PyInstrument middleware for detailed call stack profiling of FastAPI requests. This provides deep insights into performance bottlenecks and function execution times.

Overview

The profiling system uses PyInstrument to provide statistical profiling with:
  • Call stack visualization: See exactly which functions consume the most time
  • Interactive HTML reports: Drill down into performance bottlenecks
  • Minimal overhead: Statistical sampling keeps performance impact low
  • Production ready: Safe defaults and error resilience
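
The middleware itself ships with GAIA, but a minimal sketch of how a PyInstrument middleware can be wired into FastAPI is shown below. It is illustrative only, covers just the manual ?profile=1 path, and is not GAIA's actual implementation:

# Minimal sketch: profile a request on demand and return the HTML report.
# Illustrative only; GAIA's bundled middleware also handles sampling,
# configuration, and error resilience.
from fastapi import FastAPI, Request
from fastapi.responses import HTMLResponse
from pyinstrument import Profiler

app = FastAPI()

@app.middleware("http")
async def profiling_middleware(request: Request, call_next):
    if request.query_params.get("profile") != "1":
        return await call_next(request)

    profiler = Profiler(async_mode="enabled")
    profiler.start()
    # The endpoint still runs, so its work is captured in the profile.
    response = await call_next(request)
    profiler.stop()
    # Replace the normal response with the interactive HTML report.
    return HTMLResponse(profiler.output_html())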

Configuration

Profiling is completely optional and disabled by default. All configuration is done via environment variables.

Environment Variables

Add these to your .env file:
# Required: Enable profiling (disabled by default)
ENABLE_PROFILING=true

# Optional: Configure profiling behavior
PROFILING_SAMPLE_RATE=0.1          # Profile 10% of requests automatically
PROFILING_MAX_DEPTH=50             # Maximum call stack depth
PROFILING_ASYNC_MODE=enabled       # enabled, disabled, strict
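
How these variables are consumed depends on GAIA's configuration layer; a generic sketch of reading them with os.getenv (not GAIA's actual settings code) might look like this:

import os

# Sketch: read the profiling settings, falling back to the documented defaults.
ENABLE_PROFILING = os.getenv("ENABLE_PROFILING", "false").lower() == "true"
PROFILING_SAMPLE_RATE = float(os.getenv("PROFILING_SAMPLE_RATE", "0.1"))
PROFILING_MAX_DEPTH = int(os.getenv("PROFILING_MAX_DEPTH", "50"))
PROFILING_ASYNC_MODE = os.getenv("PROFILING_ASYNC_MODE", "enabled")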

Configuration Options

ENABLE_PROFILING

  • Type: boolean
  • Default: false
  • Description: Master switch for profiling functionality. Must be explicitly enabled.
  • Security: Disabled by default for safety in all environments.

PROFILING_SAMPLE_RATE

  • Type: float (0.0 to 1.0)
  • Default: 0.1 (10%)
  • Description: Fraction of requests to automatically profile in the background.
  • Examples:
    • 0.0 = No automatic profiling (manual only)
    • 0.01 = Profile 1% of requests (recommended for high-traffic production)
    • 0.1 = Profile 10% of requests (good for development/staging)
    • 1.0 = Profile all requests (only for debugging, high overhead)
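
A common way to implement this kind of sampling is an independent random draw per request; the snippet below illustrates the semantics and the expected volume, and is not code taken from GAIA:

import random

sample_rate = 0.01  # PROFILING_SAMPLE_RATE
# Profile this request with probability sample_rate.
profile_this_request = random.random() < sample_rate
# Expected volume: at roughly 50,000 requests/day, 0.01 * 50,000 ≈ 500 profiles/day.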

PROFILING_MAX_DEPTH

  • Type: integer
  • Default: 50
  • Description: Maximum call stack depth to capture in profiling reports.
  • Impact: Higher values provide more detail but increase memory usage and report complexity.
  • Recommendations:
    • Development: 50-100 (detailed debugging)
    • Production: 30-50 (focused on main bottlenecks)

PROFILING_ASYNC_MODE

  • Type: string
  • Default: "enabled"
  • Options:
    • "enabled": Profile async code with thread awareness
    • "disabled": Don’t profile async code specifically
    • "strict": Strict async profiling (may impact performance)
  • Description: Controls how PyInstrument handles async/await code profiling.
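
These values correspond to PyInstrument's own async_mode option, which is passed when the profiler is created. A minimal sketch, assuming the setting has already been read into PROFILING_ASYNC_MODE:

from pyinstrument import Profiler

# PyInstrument accepts "enabled", "disabled", or "strict" here.
profiler = Profiler(async_mode=PROFILING_ASYNC_MODE)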

Usage

Manual Profiling

Add the profile query parameter to any request to get a detailed profiling report:
# Profile specific requests
curl "http://localhost:8000/api/v1/users?profile=1"
curl "http://localhost:8000/api/v1/chat-stream?profile=true"
Accepted values: 1, true, yes (case-insensitive).

When manual profiling is requested:
  • You receive an HTML profiling report instead of the normal JSON response
  • The report shows detailed call stacks, execution times, and bottlenecks
  • Save the HTML and open in a browser for interactive analysis
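
The accepted-values check can be expressed as a small helper along these lines (the helper name is illustrative, not from GAIA's code):

from fastapi import Request

def wants_manual_profile(request: Request) -> bool:
    # "1", "true", and "yes" are accepted, case-insensitively.
    value = request.query_params.get("profile", "")
    return value.lower() in {"1", "true", "yes"}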

Automatic Sampling

When PROFILING_SAMPLE_RATE > 0, requests are automatically profiled in the background:
  • Sampled requests are profiled transparently
  • Normal JSON responses are returned to clients
  • Profiling results are logged to the application logs
  • No impact on client experience or API contracts
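
Inside the middleware, the background-sampling path might look roughly like this; a sketch under the assumptions above, not GAIA's actual implementation:

import logging
import random

from pyinstrument import Profiler

logger = logging.getLogger(__name__)

async def maybe_profile_in_background(request, call_next, sample_rate):
    # Requests outside the sample are passed through untouched.
    if random.random() >= sample_rate:
        return await call_next(request)

    profiler = Profiler(async_mode="enabled")
    profiler.start()
    response = await call_next(request)
    profiler.stop()

    # The report goes to the application logs; the client still receives
    # the normal response, so API contracts are unaffected.
    logger.info("Profile for %s:\n%s", request.url.path,
                profiler.output_text(unicode=True, color=False))
    return response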

Examples

Development Setup

# .env file for development
ENABLE_PROFILING=true
PROFILING_SAMPLE_RATE=0.2          # Profile 20% of requests
PROFILING_MAX_DEPTH=100            # Deep call stacks for debugging
PROFILING_ASYNC_MODE=enabled

Production Setup

# .env file for production
ENABLE_PROFILING=true
PROFILING_SAMPLE_RATE=0.01         # Profile only 1% of requests
PROFILING_MAX_DEPTH=30             # Focused profiling
PROFILING_ASYNC_MODE=enabled

Profiling API Requests

# Profile a chat completion request
curl "http://localhost:8000/api/v1/chat-stream?profile=1" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-token" \
  -d '{"message": "Explain quantum computing", "history": []}'

# Profile user authentication
curl "http://localhost:8000/api/v1/users/me?profile=1" \
  -H "Authorization: Bearer your-token"

# Profile any endpoint
curl "http://localhost:8000/api/v1/conversations?profile=true" \
  -H "Authorization: Bearer your-token"

Troubleshooting

Profiling Not Working

  1. Check that ENABLE_PROFILING=true in your .env file
  2. Verify PyInstrument is installed: pip install pyinstrument
  3. Look for profiling logs at application startup
  4. Try manual profiling first: ?profile=1

High Performance Impact

  1. Reduce PROFILING_SAMPLE_RATE (e.g., from 0.1 to 0.01)
  2. Lower PROFILING_MAX_DEPTH (e.g., from 100 to 30)
  3. Use PROFILING_ASYNC_MODE=disabled if async overhead is too high

Note: Profiling adds computational overhead. Use judiciously in production environments and monitor the performance impact of your chosen sample rate.