Performance Analysis (pprof)

Author: Vison | Tags: component, goroutine, heap, pprof, block

Overview

pprof is a powerful performance analysis tool in the Go ecosystem, effective for diagnosing various performance issues such as CPU, memory, and goroutines. The sponge framework deeply integrates pprof's capabilities, offering multiple flexible methods for profile collection, making it especially suitable for troubleshooting in production environments.

HTTP API Collection Method

Web and gRPC services created with sponge both support performance analysis via an HTTP interface. This feature is disabled by default and can be enabled through the configuration file configs/*.yaml.

app:
  enableHTTPProfile: true  # Enable HTTP interface for performance analysis, default route is /debug/pprof

Typical Use Cases:

Real-time analysis in development/testing environments
Performance monitoring during stress testing

Notes:

Enabling this in a production environment will incur a performance overhead of about 1%.
In addition to the standard pprof endpoints, sponge provides an IO analysis endpoint at /debug/pprof/profile-io.

Default Access URLs:

Web service: http://localhost:8080/debug/pprof
gRPC service: http://localhost:8283/debug/pprof

Use with go tool pprof for immediate analysis:

# Memory analysis example
go tool pprof http://localhost:8080/debug/pprof/heap

Signal-Triggered Collection Method

Services created with sponge support profile collection triggered by system signals. This feature is enabled by default and requires no extra configuration.

Technical Highlights:

Uses the SIGTRAP(5) signal by default.
Start/stop mechanism: the first signal starts collection, and a second signal stops it early.
Fail-safe: collection stops automatically after 60 seconds by default.

How-to Guide:

# Find the process ID
pgrep -f "service-name"        # Recommended method
ps aux | grep "service-name"   # Alternative

# Trigger collection (replace $PID with the process ID found above)
kill -SIGTRAP $PID

Output File Naming Convention: datetime_pid_uid_profile-type.out

/tmp/service-name_profile/
├── 20240726154302_1234_user_cpu.out
├── 20240726154302_1234_user_heap.out
├── 20240726154302_1234_user_goroutine.out
├── 20240726154302_1234_user_block.out
├── 20240726154302_1234_user_mutex.out
└── 20240726154302_1234_user_threadcreate.out

Advanced Configuration:

// Enable trace collection (note the storage overhead)
prof.EnableTrace()
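
To give a sense of what trace collection involves, the sketch below records an execution trace with the standard runtime/trace package. It does not show how prof.EnableTrace is implemented; the helper name captureTrace is hypothetical. Traces record individual scheduler and runtime events, which is why they tend to grow much faster than sampled profiles.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/trace"
)

// captureTrace records an execution trace while work runs and returns the
// raw trace bytes. It fails if tracing is already active in the process.
func captureTrace(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		return nil, err
	}
	work()
	trace.Stop() // blocks until all trace data has been written
	return buf.Bytes(), nil
}

func main() {
	data, err := captureTrace(func() {
		sum := 0
		for i := 0; i < 1_000_000; i++ {
			sum += i
		}
		_ = sum
	})
	if err != nil {
		fmt.Println("trace failed:", err)
		return
	}
	fmt.Printf("trace size: %d bytes\n", len(data))
}
```

A trace saved to a file can be inspected with go tool trace.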

Analysis Method:

# Flame graph analysis in the pprof web UI (use another port if 8080 is already taken by the web service)
go tool pprof -http=:8080 /tmp/service-name_profile/20240726154302_1234_user_cpu.out

Intelligent Adaptive Collection Method

Services created with sponge support automatic profile collection triggered by system resource monitoring. This feature is enabled by default.

app:
  enableStat: true  # Print resource statistics (true: enable, false: disable)

sponge combines signal-triggered collection with resource monitoring, so profiles are collected automatically when resource usage crosses a threshold.

Triggering Mechanism:

Metric Type	Sampling Interval	Condition to Trigger Profile Collection
CPU	Once per minute (default)	Average of the last 3 samples > 80%
Memory	Once per minute (default)	Average of the last 3 samples > 80%
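
The trigger condition in the table can be sketched as a sliding window over recent samples. This is an illustration only: the type and function names are hypothetical, and requiring a full window before triggering is an assumption rather than documented sponge behavior.

```go
package main

import "fmt"

// window keeps the most recent samples and checks their average
// against a threshold.
type window struct {
	samples   []float64
	size      int
	threshold float64
}

func newWindow(size int, threshold float64) *window {
	return &window{size: size, threshold: threshold}
}

// add records a sample and reports whether the window is full and the
// average of its samples is strictly above the threshold.
func (w *window) add(v float64) bool {
	w.samples = append(w.samples, v)
	if len(w.samples) > w.size {
		w.samples = w.samples[1:] // drop the oldest sample
	}
	if len(w.samples) < w.size {
		return false // assumption: wait for a full window before triggering
	}
	sum := 0.0
	for _, s := range w.samples {
		sum += s
	}
	return sum/float64(len(w.samples)) > w.threshold
}

func main() {
	cpu := newWindow(3, 80)
	for _, usage := range []float64{70, 85, 90, 60} {
		fmt.Printf("cpu=%.0f%% trigger=%v\n", usage, cpu.add(usage))
	}
}
```

Averaging over three samples means a single short spike does not start a collection, while sustained load does. In the run above only the third sample triggers, because 70, 85, and 90 average to about 81.7%.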

The output specification for the collected profile files is the same as the Signal-Triggered Collection Method.

Technical Advantages:

Automatically correlates monitoring with profile collection.
Captures profiles for issues that occur while no one is watching, such as overnight spikes.

Typical Workflow:

Resource Monitoring → Threshold Breach → Signal Trigger → Automatic Collection → Save to Local File → Analysis

It is recommended to record trigger events in your logging system so they can be correlated with the collected profiles later.

Note

This feature is not supported on Windows systems.