Observability is a critical aspect of modern Java application development, allowing teams to understand complex system behavior, identify performance bottlenecks, and quickly diagnose issues in production environments.
This guide covers the key pillars of observability: metrics, logging, distributed tracing, and health monitoring.
By implementing effective observability practices, development teams can gain visibility into application behavior, improve troubleshooting, and enhance overall system reliability.
Metrics provide numerical data about application behavior and performance, allowing teams to monitor trends, identify anomalies, and set alerts.
Micrometer provides a vendor-neutral metrics collection API for Java applications.
<!-- Add the dependency in pom.xml -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.10.2</version>
</dependency>
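import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Metrics;
import io.micrometer.core.instrument.Timer;

// Meters created through the static Metrics class use the global registry;
// add a concrete registry with Metrics.addRegistry(...) so values are actually recorded.

// Create a counter tagged by URI and HTTP method
Counter requestCounter = Metrics.counter("http.requests",
        "uri", "/api/users",
        "method", "GET");
// Increment the counter
requestCounter.increment();
// Create a timer
Timer responseTimer = Metrics.timer("http.response.time",
        "uri", "/api/users");
// Record request duration
responseTimer.record(() -> {
    // Work being timed; processRequest() is the application's own method
    return processRequest();
});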
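The tags passed to the counter above (`uri`, `method`) mean that each distinct tag combination is tracked as its own series. The following is a minimal sketch of that idea using only the JDK; `TaggedCounters` is a hypothetical stand-in written for illustration and is not how Micrometer is implemented.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

/** Hypothetical stand-in for a metrics registry; for illustration only. */
final class TaggedCounters {
    // One independent counter per unique (name, tags) combination.
    private final Map<String, LongAdder> series = new ConcurrentHashMap<>();

    /** Increments the series identified by the name and alternating key/value tags. */
    void increment(String name, String... keyValues) {
        series.computeIfAbsent(seriesId(name, keyValues), id -> new LongAdder()).increment();
    }

    /** Returns the current count of a series, or 0 if it was never incremented. */
    long count(String name, String... keyValues) {
        LongAdder adder = series.get(seriesId(name, keyValues));
        return adder == null ? 0 : adder.sum();
    }

    // Sorts tags by key so that tag order does not create duplicate series.
    private static String seriesId(String name, String... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("tags must be key/value pairs");
        }
        Map<String, String> tags = new TreeMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            tags.put(keyValues[i], keyValues[i + 1]);
        }
        String labels = tags.entrySet().stream()
                .map(e -> e.getKey() + "=\"" + e.getValue() + "\"")
                .collect(Collectors.joining(","));
        return name + "{" + labels + "}";
    }
}
```

Because every new tag value creates a new series, tags with unbounded values (user IDs, raw URLs with path parameters) are usually best avoided.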
Effective logging provides context-rich information about application events and errors, aiding in troubleshooting and analysis.
<!-- Add the dependencies in pom.xml -->
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>2.0.6</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.4.5</version>
</dependency>
<!-- logback.xml configuration for JSON output -->
<!-- LogstashEncoder is not part of Logback; it requires the net.logstash.logback:logstash-logback-encoder dependency -->
<appender name="JSON" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
</appender>
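To make "context-rich" concrete, here is a minimal JDK-only sketch of what a structured (JSON) log line carries: a timestamp, a level, a message, and arbitrary context fields. `JsonLogLine` and its field names are hypothetical and chosen for illustration; a real encoder such as LogstashEncoder uses its own field names and handles many more cases.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper that renders one log event as a single JSON object. */
final class JsonLogLine {
    static String format(Instant timestamp, String level, String message,
                         Map<String, String> context) {
        // LinkedHashMap keeps the fixed fields first, followed by the context fields.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("timestamp", timestamp.toString());
        fields.put("level", level);
        fields.put("message", message);
        fields.putAll(context); // a context key equal to a fixed field name overrides it
        return fields.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    // Wraps a string in double quotes and escapes characters JSON does not allow raw.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }
}
```

With SLF4J, context fields such as an order ID are typically attached through `MDC.put(...)` rather than built by hand, and the configured encoder takes care of serialization.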
Distributed tracing tracks requests as they flow through microservices, providing visibility into end-to-end transactions.
OpenTelemetry provides vendor-neutral APIs, libraries, and agents for collecting traces, metrics, and logs.
<!-- Add the OpenTelemetry dependencies in pom.xml -->
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-api</artifactId>
    <version>1.22.0</version>
</dependency>
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-sdk</artifactId>
    <version>1.22.0</version>
</dependency>
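import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanKind;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.context.Scope;

// Start a new span for the operation
// (tracer is assumed to be an io.opentelemetry.api.trace.Tracer obtained elsewhere)
Span currentSpan = tracer.spanBuilder("processOrder")
        .setSpanKind(SpanKind.INTERNAL)
        .setAttribute("orderId", orderId)
        .startSpan();
// makeCurrent() puts the span into the current context for the duration of the try block
try (Scope scope = currentSpan.makeCurrent()) {
    // Execute the business logic
    processOrderItems(orderId);
} catch (Exception e) {
    // Attach the exception to the span and mark the span as failed
    currentSpan.recordException(e);
    currentSpan.setStatus(StatusCode.ERROR, e.getMessage());
    throw e;
} finally {
    // Always end the span so that it is exported
    currentSpan.end();
}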
For distributed tracing to work across service boundaries, the trace context must be propagated with every outgoing call, for example in HTTP headers or message metadata.
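For HTTP, OpenTelemetry's default propagation format is the W3C Trace Context `traceparent` header, which has the form `version-traceid-parentid-flags`. The sketch below hand-rolls injection and extraction with JDK classes only, to show what travels between services. `TraceParent` is a hypothetical helper written for illustration; in a real application the OpenTelemetry propagators (such as `W3CTraceContextPropagator`) and instrumentation libraries do this work.

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: minimal W3C traceparent handling, for illustration only. */
final class TraceParent {
    static final String HEADER = "traceparent";
    // Version "00", 32 hex chars of trace ID, 16 hex chars of parent span ID, 2 hex chars of flags.
    private static final Pattern FORMAT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String traceId;
    final String spanId;
    final boolean sampled;

    TraceParent(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }

    /** Caller side: writes the context into the outgoing request headers. */
    void inject(Map<String, String> headers) {
        headers.put(HEADER, "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00"));
    }

    /** Callee side: reads the context back; empty if the header is missing or malformed. */
    static Optional<TraceParent> extract(Map<String, String> headers) {
        String value = headers.get(HEADER);
        if (value == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(value.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        // All-zero IDs are invalid according to the specification.
        if (m.group(1).matches("0+") || m.group(2).matches("0+")) {
            return Optional.empty();
        }
        // The lowest bit of the flags byte is the "sampled" flag.
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) != 0;
        return Optional.of(new TraceParent(m.group(1), m.group(2), sampled));
    }
}
```

The receiving service starts its own span with the extracted trace ID and uses the extracted span ID as the parent, which is what links spans from different services into one trace. This sketch only accepts version `00` and ignores the companion `tracestate` header.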
Real-time health monitoring enables proactive issue detection and resolution.
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
# application.properties
management.endpoints.web.exposure.include=health,info,metrics,prometheus
# "always" shows component details to every caller; consider "when-authorized" in production
management.endpoint.health.show-details=always
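import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// Custom health indicator that reports database connectivity
@Component
public class DatabaseHealthIndicator implements HealthIndicator {
    private final DataSource dataSource;

    public DatabaseHealthIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        // Run a trivial query to verify that the database is reachable
        try (Connection conn = dataSource.getConnection();
             Statement stmt = conn.createStatement()) {
            stmt.execute("SELECT 1");
            return Health.up()
                    .withDetail("database", "PostgreSQL")
                    .withDetail("version", getDatabaseVersion(conn))
                    .build();
        } catch (SQLException e) {
            return Health.down()
                    .withDetail("error", e.getMessage())
                    .build();
        }
    }

    // "SELECT version()" is PostgreSQL syntax; other databases need a different query
    private String getDatabaseVersion(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT version()")) {
            return rs.next() ? rs.getString(1) : "unknown";
        }
    }
}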
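Spring Boot Actuator combines the results of all health indicators into one overall status; with the default ordering, the most severe status wins (DOWN, then OUT_OF_SERVICE, then UP, then UNKNOWN). The following JDK-only sketch illustrates that rule. `HealthStatus` and `HealthAggregator` are hypothetical names for illustration, not Actuator classes, and the behavior for an empty input is an assumption of this sketch.

```java
import java.util.Collection;

/** Hypothetical status enum, declared from most to least severe. */
enum HealthStatus {
    DOWN, OUT_OF_SERVICE, UP, UNKNOWN
}

/** Hypothetical aggregator: the overall status is the most severe component status. */
final class HealthAggregator {
    static HealthStatus aggregate(Collection<HealthStatus> components) {
        // Start from the least severe status; an empty collection therefore yields UNKNOWN.
        HealthStatus worst = HealthStatus.UNKNOWN;
        for (HealthStatus status : components) {
            // A lower ordinal means a more severe status.
            if (status.ordinal() < worst.ordinal()) {
                worst = status;
            }
        }
        return worst;
    }
}
```

Under this rule a single failing indicator, such as the database check above, is enough to make the whole application report DOWN, which is usually what a load balancer or orchestrator probe should see.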
A variety of tools are available for collecting, storing, and visualizing observability data:
| Stack | Components | Best For |
|---|---|---|
| ELK Stack | Elasticsearch, Logstash, Kibana | Log aggregation and analysis |
| Prometheus + Grafana | Prometheus, Alertmanager, Grafana | Metrics collection and visualization |
| Jaeger | Jaeger Collector, Query Service, UI | Distributed tracing |
| Datadog | Unified SaaS platform | Enterprise-scale observability |
| New Relic | Unified SaaS platform | Full-stack observability |
Implementing robust observability practices is critical for modern Java applications, particularly in distributed and microservice architectures. By combining metrics, logging, tracing, and health monitoring, development teams can gain comprehensive visibility into application behavior, leading to faster troubleshooting, more reliable systems, and improved user experiences.
Start with small, focused improvements to your observability stack, prioritizing the areas that provide the most value for your specific use cases. As your applications evolve, continuously refine your observability strategy to address new challenges and leverage emerging tools and practices.