Performance Analysis (pprof)
Overview
pprof is the performance analysis tool of the Go ecosystem, used to diagnose CPU, memory, and goroutine problems. The sponge framework integrates pprof and offers several methods for profile collection, which makes it well suited to troubleshooting in production environments.
HTTP API Collection Method
Web and gRPC services created with sponge both support performance analysis via an HTTP interface. This feature is disabled by default and can be enabled through the configuration file configs/*.yaml.
app:
  enableHTTPProfile: true # Enable HTTP interface for performance analysis, default route is /debug/pprof
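For context, a /debug/pprof route conventionally maps to the handlers in Go's standard net/http/pprof package. The sketch below shows that wiring with the standard library only; it is not sponge's own registration code, and the helper names newProfileMux and probe are hypothetical. It checks the route in memory, so no port is opened.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newProfileMux registers the standard pprof handlers under /debug/pprof.
// (Hypothetical helper; sponge performs its own registration internally.)
func newProfileMux() *http.ServeMux {
	mux := http.NewServeMux()
	// Index also serves named profiles such as /debug/pprof/heap.
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// probe sends an in-memory GET request to h and returns the HTTP status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newProfileMux()
	fmt.Println("index status:", probe(mux, "/debug/pprof/"))
	fmt.Println("heap status:", probe(mux, "/debug/pprof/heap"))
	// A real service would expose the mux on a listener instead, for example:
	// http.ListenAndServe("localhost:8080", mux)
}
```

Registering the handlers on a dedicated mux, rather than importing net/http/pprof for its side effect on http.DefaultServeMux, keeps the profiling routes explicit and easy to switch off.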
Typical Use Cases:
- Real-time analysis in development/testing environments
- Performance monitoring during stress testing
Notes:
- Enabling this in a production environment will incur a performance overhead of about 1%.
- Beyond the standard pprof profiles, sponge adds an IO analysis endpoint at /debug/pprof/profile-io.
Default Access URLs:
- Web service: http://localhost:8080/debug/pprof
- gRPC service: http://localhost:8283/debug/pprof
Use go tool pprof for immediate analysis:
# Memory analysis example
go tool pprof http://localhost:8080/debug/pprof/heap
Signal-Triggered Collection Method
Services created with sponge support profile collection triggered by system signals. This feature is enabled by default and requires no extra configuration.
Technical Highlights:
- Uses the SIGTRAP(5) signal by default.
- Start/stop mechanism: the first signal starts collection, a second signal terminates it early.
- Fail-safe: collection stops automatically after 60 seconds by default.
How-to Guide:
# Find the process ID
pgrep -f "service-name" # Recommended method
ps aux | grep "service-name"
# Trigger collection
kill -SIGTRAP $PID
Output File Naming Convention: datetime_pid_uid_profile-type.out
/tmp/service-name_profile/
├── 20240726154302_1234_user_cpu.out
├── 20240726154302_1234_user_heap.out
├── 20240726154302_1234_user_goroutine.out
├── 20240726154302_1234_user_block.out
├── 20240726154302_1234_user_mutex.out
└── 20240726154302_1234_user_threadcreate.out
Advanced Configuration:
// Enable trace collection (note the storage overhead)
prof.EnableTrace()
Analysis Method:
# Flame graph analysis (choose another port if 8080 is already used by the service)
go tool pprof -http=:8080 /tmp/service-name_profile/20240726154302_1234_user_cpu.out
Intelligent Adaptive Collection Method
Services created with sponge support automatic profile collection triggered by system resource monitoring. This feature is enabled by default.
app:
  enableStat: true # Whether to print statistics (true: enable, false: disable)
sponge combines signal-triggered collection with resource monitoring, so profiles are collected automatically when resource usage stays high.
Triggering Mechanism:
Metric Type | Sampling Strategy | Condition to Trigger Profile Collection |
---|---|---|
CPU | Default: once per minute | Average of the last 3 samples > 80% |
Memory | Default: once per minute | Average of the last 3 samples > 80% |
The output specification for the collected profile files is the same as in the Signal-Triggered Collection Method.
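The trigger condition in the table (average of the last 3 samples above 80%) can be sketched as a small sliding window. This is an illustration under assumptions, not sponge's code: the type window is hypothetical, and requiring a full window before triggering is a choice made for this sketch.

```go
package main

import "fmt"

// window keeps the most recent size samples (usage in percent).
type window struct {
	samples []float64
	size    int
}

func newWindow(size int) *window { return &window{size: size} }

// Add appends a sample and drops the oldest one once the window is full.
func (w *window) Add(v float64) {
	w.samples = append(w.samples, v)
	if len(w.samples) > w.size {
		w.samples = w.samples[1:]
	}
}

// Exceeds reports whether the window is full and its average is above
// threshold. Waiting for a full window is an assumption of this sketch.
func (w *window) Exceeds(threshold float64) bool {
	if len(w.samples) < w.size {
		return false
	}
	var sum float64
	for _, v := range w.samples {
		sum += v
	}
	return sum/float64(len(w.samples)) > threshold
}

func main() {
	cpu := newWindow(3)
	// One sample per minute in the real service; fed directly here.
	for _, usage := range []float64{70, 85, 82, 90} {
		cpu.Add(usage)
		fmt.Printf("sample=%.0f%% trigger=%v\n", usage, cpu.Exceeds(80))
	}
}
```

Averaging over several samples means a single short spike does not start a collection, while sustained load does.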
Technical Advantages:
- Automatically correlates monitoring with profile collection.
- Captures profiles for problems that occur while nobody is watching, for example overnight.
Typical Workflow:
Resource Monitoring → Threshold Breach → Signal Trigger → Automatic Collection → Save to Local File → Analysis
It is recommended to log trigger events in a logging system for easier correlation analysis later.
Note
This feature is not supported on Windows systems.