Profiling

GAIA includes optional PyInstrument middleware for detailed call stack profiling of FastAPI requests. This provides deep insights into performance bottlenecks and function execution times.

Overview

The profiling system uses PyInstrument to provide statistical profiling with:
  • Call stack visualization: See exactly which functions consume the most time
  • Interactive HTML reports: Drill down into performance bottlenecks
  • Minimal overhead: Statistical sampling keeps performance impact low
  • Production ready: Safe defaults and error resilience
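
The middleware itself ships with GAIA, but a minimal sketch of how a PyInstrument middleware can be wired into FastAPI is shown below. It is illustrative only, covers just the manual ?profile=1 path, and is not GAIA's actual implementation:

# Minimal sketch: profile a request on demand and return the HTML report.
# Illustrative only; GAIA's bundled middleware also handles sampling,
# configuration, and error resilience.
from fastapi import FastAPI, Request
from fastapi.responses import HTMLResponse
from pyinstrument import Profiler

app = FastAPI()

@app.middleware("http")
async def profiling_middleware(request: Request, call_next):
    if request.query_params.get("profile") != "1":
        return await call_next(request)

    profiler = Profiler(async_mode="enabled")
    profiler.start()
    # The endpoint still runs, so its work is captured in the profile.
    response = await call_next(request)
    profiler.stop()
    # Replace the normal response with the interactive HTML report.
    return HTMLResponse(profiler.output_html())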

Configuration

Profiling is completely optional and disabled by default. All configuration is done via environment variables.

Environment Variables

Add these to your .env file:
# Required: Enable profiling (disabled by default)
ENABLE_PROFILING=true

# Optional: Configure profiling behavior
PROFILING_SAMPLE_RATE=0.1          # Profile 10% of requests automatically
PROFILING_MAX_DEPTH=50             # Maximum call stack depth
PROFILING_ASYNC_MODE=enabled       # enabled, disabled, strict
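
How these variables are consumed depends on GAIA's configuration layer; a generic sketch of reading them with os.getenv (not GAIA's actual settings code) might look like this:

import os

# Sketch: read the profiling settings, falling back to the documented defaults.
ENABLE_PROFILING = os.getenv("ENABLE_PROFILING", "false").lower() == "true"
PROFILING_SAMPLE_RATE = float(os.getenv("PROFILING_SAMPLE_RATE", "0.1"))
PROFILING_MAX_DEPTH = int(os.getenv("PROFILING_MAX_DEPTH", "50"))
PROFILING_ASYNC_MODE = os.getenv("PROFILING_ASYNC_MODE", "enabled")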

Configuration Options

ENABLE_PROFILING

  • Type: boolean
  • Default: false
  • Description: Master switch for profiling functionality. Must be explicitly enabled.
  • Security: Disabled by default for safety in all environments.

PROFILING_SAMPLE_RATE

  • Type: float (0.0 to 1.0)
  • Default: 0.1 (10%)
  • Description: Fraction of requests to automatically profile in the background.
  • Examples:
    • 0.0 = No automatic profiling (manual only)
    • 0.01 = Profile 1% of requests (recommended for high-traffic production)
    • 0.1 = Profile 10% of requests (good for development/staging)
    • 1.0 = Profile all requests (only for debugging, high overhead)
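
A common way to implement this kind of sampling is an independent random draw per request; the snippet below illustrates the semantics and the expected volume, and is not code taken from GAIA:

import random

sample_rate = 0.01  # PROFILING_SAMPLE_RATE
# Profile this request with probability sample_rate.
profile_this_request = random.random() < sample_rate
# Expected volume: at roughly 50,000 requests/day, 0.01 * 50,000 ≈ 500 profiles/day.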

PROFILING_MAX_DEPTH

  • Type: integer
  • Default: 50
  • Description: Maximum call stack depth to capture in profiling reports.
  • Impact: Higher values provide more detail but increase memory usage and report complexity.
  • Recommendations:
    • Development: 50-100 (detailed debugging)
    • Production: 30-50 (focused on main bottlenecks)

PROFILING_ASYNC_MODE

  • Type: string
  • Default: "enabled"
  • Options:
    • "enabled": Profile async code with thread awareness
    • "disabled": Don’t profile async code specifically
    • "strict": Strict async profiling (may impact performance)
  • Description: Controls how PyInstrument handles async/await code profiling.
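
These values correspond to PyInstrument's own async_mode option, which is passed when the profiler is created. A minimal sketch, assuming the setting has already been read into PROFILING_ASYNC_MODE:

from pyinstrument import Profiler

# PyInstrument accepts "enabled", "disabled", or "strict" here.
profiler = Profiler(async_mode=PROFILING_ASYNC_MODE)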

Usage

Manual Profiling

Add the profile query parameter to any request to get a detailed profiling report:
# Profile specific requests
curl "http://localhost:8000/api/v1/users?profile=1"
curl "http://localhost:8000/api/v1/chat-stream?profile=true"
Accepted values: 1, true, yes (case-insensitive).

When manual profiling is requested:
  • You receive an HTML profiling report instead of the normal JSON response
  • The report shows detailed call stacks, execution times, and bottlenecks
  • Save the HTML and open in a browser for interactive analysis
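
The accepted-values check can be expressed as a small helper along these lines (the helper name is illustrative, not from GAIA's code):

from fastapi import Request

def wants_manual_profile(request: Request) -> bool:
    # "1", "true", and "yes" are accepted, case-insensitively.
    value = request.query_params.get("profile", "")
    return value.lower() in {"1", "true", "yes"}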

Automatic Sampling

When PROFILING_SAMPLE_RATE > 0, requests are automatically profiled in the background:
  • Sampled requests are profiled transparently
  • Normal JSON responses are returned to clients
  • Profiling results are logged to the application logs
  • No impact on client experience or API contracts
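
Inside the middleware, the background-sampling path might look roughly like this; a sketch under the assumptions above, not GAIA's actual implementation:

import logging
import random

from pyinstrument import Profiler

logger = logging.getLogger(__name__)

async def maybe_profile_in_background(request, call_next, sample_rate):
    # Requests outside the sample are passed through untouched.
    if random.random() >= sample_rate:
        return await call_next(request)

    profiler = Profiler(async_mode="enabled")
    profiler.start()
    response = await call_next(request)
    profiler.stop()

    # The report goes to the application logs; the client still receives
    # the normal response, so API contracts are unaffected.
    logger.info("Profile for %s:\n%s", request.url.path,
                profiler.output_text(unicode=True, color=False))
    return response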

Examples

Development Setup

# .env file for development
ENABLE_PROFILING=true
PROFILING_SAMPLE_RATE=0.2          # Profile 20% of requests
PROFILING_MAX_DEPTH=100            # Deep call stacks for debugging
PROFILING_ASYNC_MODE=enabled

Production Setup

# .env file for production
ENABLE_PROFILING=true
PROFILING_SAMPLE_RATE=0.01         # Profile only 1% of requests
PROFILING_MAX_DEPTH=30             # Focused profiling
PROFILING_ASYNC_MODE=enabled

Profiling API Requests

# Profile a chat completion request
curl "http://localhost:8000/api/v1/chat-stream?profile=1" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-token" \
  -d '{"message": "Explain quantum computing", "history": []}'

# Profile user authentication
curl "http://localhost:8000/api/v1/users/me?profile=1" \
  -H "Authorization: Bearer your-token"

# Profile any endpoint
curl "http://localhost:8000/api/v1/conversations?profile=true" \
  -H "Authorization: Bearer your-token"

Troubleshooting

Profiling Not Working

  1. Check that ENABLE_PROFILING=true in your .env file
  2. Verify PyInstrument is installed: pip install pyinstrument
  3. Look for profiling logs at application startup
  4. Try manual profiling first: ?profile=1

High Performance Impact

  1. Reduce PROFILING_SAMPLE_RATE (e.g., from 0.1 to 0.01)
  2. Lower PROFILING_MAX_DEPTH (e.g., from 100 to 30)
  3. Use PROFILING_ASYNC_MODE=disabled if async overhead is too high

Note: Profiling adds computational overhead. Use judiciously in production environments and monitor the performance impact of your chosen sample rate.