Performance optimization is a critical aspect of Java application development. Well-optimized applications provide better user experiences, reduce infrastructure costs, and scale more effectively. This comprehensive guide explores advanced techniques for optimizing Java applications at various levels, from code design to runtime configuration.
Key areas of Java performance optimization include:

- Profiling and measurement
- Garbage collection tuning
- Code-level optimizations (strings, loops, data structures)
- JIT compiler behavior
- Concurrency and thread pool design
- I/O and network efficiency
- Caching

Effective performance optimization begins with accurate profiling to identify bottlenecks.
# Start a 60-second JFR recording (no extra unlock flags needed on JDK 11+;
# -XX:+FlightRecorder is deprecated and unnecessary on modern JDKs)
java -XX:StartFlightRecording=duration=60s,filename=myrecording.jfr MyApplication

# Continuous recording persisted to disk and dumped on exit
java -XX:StartFlightRecording=disk=true,dumponexit=true,maxage=12h,filename=myapp.jfr MyApplication
// Programmatic JFR recording (requires jdk.jfr.Recording,
// java.time.Duration, and java.nio.file.Path)
try (Recording recording = new Recording()) {
    recording.enable("jdk.ObjectAllocationInNewTLAB")
             .withThreshold(Duration.ofMillis(10));
    recording.start();
    // Run your workload, then capture the data
    recording.dump(Path.of("recording.jfr"));
}
// Before: Creating a new String on each iteration
for (int i = 0; i < 1_000_000; i++) {
    String result = "Value: " + i;
    process(result);
}

// After: Reusing a StringBuilder to reduce allocations
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 1_000_000; i++) {
    sb.setLength(0);
    sb.append("Value: ").append(i);
    process(sb.toString());
}
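A quick way to confirm the two variants are interchangeable before committing to the less readable one; the class and method names below are illustrative, and `process` is replaced by collecting into a list so the harness is self-contained:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative harness: both variants must produce identical output;
// the StringBuilder version simply allocates fewer intermediate objects.
public class ConcatDemo {
    static List<String> withConcat(int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add("Value: " + i);          // new String each iteration
        }
        return out;
    }

    static List<String> withBuilder(int n) {
        List<String> out = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.setLength(0);                 // reuse the same buffer
            sb.append("Value: ").append(i);
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        if (!withConcat(1000).equals(withBuilder(1000)))
            throw new AssertionError("variants diverge");
    }
}
```

For rigorous comparisons, prefer a JMH benchmark over ad-hoc timing; the JIT can optimize away naive loops.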
Selecting and tuning the appropriate garbage collector can significantly impact application performance.
| Garbage Collector | Best For | Key Flags |
|---|---|---|
| G1GC (default since JDK 9) | Most applications; balanced throughput and latency | -XX:+UseG1GC -XX:MaxGCPauseMillis=200 |
| ZGC | Low-latency applications with large heaps | -XX:+UseZGC -XX:ConcGCThreads=N |
| Shenandoah | Applications requiring consistent pause times | -XX:+UseShenandoahGC |
| Parallel GC | Batch processing; maximizing throughput | -XX:+UseParallelGC -XX:GCTimeRatio=N |
# G1GC tuning for low-latency applications
java -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=100 \
     -XX:+ParallelRefProcEnabled \
     -XX:G1HeapRegionSize=8m \
     -XX:InitiatingHeapOccupancyPercent=45 \
     -Xms4g -Xmx4g \
     -jar myapp.jar

# ZGC tuning for very large heaps
java -XX:+UseZGC \
     -XX:ConcGCThreads=2 \
     -XX:ZCollectionInterval=120 \
     -Xms16g -Xmx16g \
     -jar myapp.jar
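Whichever collector you choose, verify its behavior with unified GC logging (JDK 9+) before and after tuning; the flag syntax below is standard, while the file name and rotation sizes are illustrative:

```shell
# Log all GC events with timestamps to a rotating file for later analysis
java -Xlog:gc*:file=gc.log:time,uptime,level,tags:filecount=5,filesize=10M \
     -jar myapp.jar
```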
String operations are common performance bottlenecks in Java applications.
// Inefficient: O(n^2) string concatenation in a loop
String result = "";
for (int i = 0; i < items.size(); i++) {
    result += items.get(i);   // copies the accumulated string each time
}

// Optimized with StringBuilder
StringBuilder sb = new StringBuilder(items.size() * 16); // Pre-size if possible
for (int i = 0; i < items.size(); i++) {
    sb.append(items.get(i));
}
String result = sb.toString();

// Use String.join for simple delimiter-separated concatenation
String result = String.join(",", items);
// Before: size() is re-evaluated in the loop condition on each iteration
for (int i = 0; i < list.size(); i++) {
    process(list.get(i));
}

// After: caching the size (the JIT usually does this for you, but it
// guarantees a single call when size() is not trivially inlinable)
int size = list.size();
for (int i = 0; i < size; i++) {
    process(list.get(i));
}

// For collections, prefer the enhanced for loop or streams
for (Item item : list) {
    process(item);
}

// Parallel processing for CPU-intensive operations on large collections
list.parallelStream()
    .filter(Item::isValid)
    .map(Item::process)
    .collect(Collectors.toList());
Choosing the right data structure can dramatically impact performance.
| Collection | Access | Insert | Search | Memory |
|---|---|---|---|---|
| ArrayList | O(1) | O(1)* / O(n) | O(n) | Low |
| LinkedList | O(n) | O(1)** | O(n) | High |
| HashMap | O(1)* | O(1)* | O(1)* | Medium |
| TreeMap | O(log n) | O(log n) | O(log n) | Medium |
| HashSet | N/A | O(1)* | O(1)* | Medium |

* Average/amortized case; can degrade under certain conditions (e.g. heavy hash collisions)
** O(1) only at the ends or through an iterator; reaching an arbitrary index is O(n)
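The search column is the one that bites most often in practice: a membership test on an ArrayList scans the backing array, while a HashSet hashes straight to a bucket. A minimal sketch (class, method names, and sizes here are ours):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LookupDemo {
    // Both lookups succeed, but the list scans n elements (O(n))
    // while the set computes one hash (O(1) average).
    static boolean containsBoth(int n) {
        List<Integer> list = new ArrayList<>();
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
            set.add(i);
        }
        boolean inList = list.contains(n - 1); // linear scan, worst case
        boolean inSet = set.contains(n - 1);   // direct bucket lookup
        return inList && inSet;
    }

    public static void main(String[] args) {
        if (!containsBoth(100_000)) throw new AssertionError();
    }
}
```

The rule of thumb: if you call `contains` on a list inside a loop, convert it to a set first.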
Understanding how the JIT compiler works can help you write code that performs better at runtime.
# These optimizations are enabled by default in modern HotSpot; the flags are
# useful mainly to confirm settings or to re-enable them after debugging
java -XX:+OptimizeStringConcat \
     -XX:+DoEscapeAnalysis \
     -XX:+EliminateAllocations \
     -XX:+UseCompressedOops \
     -jar myapp.jar

# Print JIT compilation and inlining decisions
java -XX:+PrintCompilation \
     -XX:+UnlockDiagnosticVMOptions \
     -XX:+PrintInlining \
     -jar myapp.jar
The JIT compiler automatically inlines small, frequently called methods, so keep hot-path methods short and focused rather than folding everything into one large method.
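A hypothetical sketch of inlining-friendly code: small accessors compile to a handful of bytecodes (HotSpot inlines methods under roughly 35 bytecodes by default, and substantially larger ones when hot), so the abstraction costs nothing in a hot loop:

```java
// Small, final methods with monomorphic call sites are prime inlining
// candidates; after inlining, p.x() becomes a plain field load.
final class Point {
    private final double x;
    private final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    double x() { return x; }   // trivially inlinable accessor
    double y() { return y; }

    static double sumX(Point[] points) {
        double sum = 0;
        for (Point p : points) {
            sum += p.x();      // no call overhead once inlined
        }
        return sum;
    }
}
```

Conversely, megamorphic call sites (many receiver types at one call) defeat inlining, which is one reason deep interface hierarchies on hot paths can cost more than they appear to.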
// Custom thread pool configuration
ThreadPoolExecutor executor = new ThreadPoolExecutor(
    corePoolSize,      // Core threads to keep alive
    maxPoolSize,       // Maximum pool size
    keepAliveTime,     // Time to keep idle non-core threads
    TimeUnit.SECONDS,
    new LinkedBlockingQueue<>(queueCapacity),
    new ThreadPoolExecutor.CallerRunsPolicy());

// For CPU-bound tasks: one thread per core
int cpuThreads = Runtime.getRuntime().availableProcessors();
ExecutorService executorService = Executors.newFixedThreadPool(cpuThreads);

// For I/O-bound tasks: more threads than cores, since most are blocked waiting
int ioThreads = Runtime.getRuntime().availableProcessors() * 2; // Common heuristic
ExecutorService ioExecutorService = Executors.newFixedThreadPool(ioThreads);
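The ×2 heuristic is a special case of a common sizing rule (popularized by Java Concurrency in Practice): threads ≈ cores × (1 + wait time / compute time). A small helper, with names of our own choosing:

```java
// Sizing heuristic for I/O-bound pools: tasks that spend most of their
// time blocked need extra threads to keep the CPUs busy.
public class PoolSizing {
    static int poolSize(int cores, long waitMillis, long computeMillis) {
        return (int) (cores * (1 + (double) waitMillis / computeMillis));
    }

    public static void main(String[] args) {
        // 4 cores, tasks waiting 50ms per 5ms of CPU work -> 44 threads
        System.out.println(poolSize(4, 50, 5));
    }
}
```

Treat the result as a starting point and tune under realistic load; wait/compute ratios measured in production often differ from estimates.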
Excessive synchronization can cause contention and reduce performance.
// Before: Coarse-grained locking holds the monitor for the whole batch
public synchronized void processAll(List<Task> tasks) {
    for (Task task : tasks) {
        process(task);
    }
}

// After: Fine-grained locking releases the monitor between tasks
public void processAll(List<Task> tasks) {
    for (Task task : tasks) {
        synchronized (this) {
            process(task);
        }
    }
}
// Even better: Lock striping spreads contention across multiple locks
private final Lock[] locks = new ReentrantLock[16];
{
    for (int i = 0; i < locks.length; i++) {
        locks[i] = new ReentrantLock();
    }
}

public void processItem(Item item) {
    // floorMod avoids a negative index when hashCode() is negative
    int lockIndex = Math.floorMod(item.hashCode(), locks.length);
    locks[lockIndex].lock();
    try {
        // Process with the finer-grained lock
    } finally {
        locks[lockIndex].unlock();
    }
}
Consider using non-blocking data structures from java.util.concurrent for high-contention scenarios.
// Concurrent collections for high-throughput scenarios
ConcurrentHashMap<String, User> userCache = new ConcurrentHashMap<>();
ConcurrentLinkedQueue<Task> taskQueue = new ConcurrentLinkedQueue<>();

// Atomic operations for counters
AtomicLong counter = new AtomicLong(0);
long nextValue = counter.incrementAndGet();

// Lock-free conditional update with compare-and-swap semantics
public boolean updateIfPresent(String key, Value oldValue, Value newValue) {
    return map.replace(key, oldValue, newValue);
}
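For derived updates (where the new value depends on the old one) the standard lock-free pattern is a compareAndSet retry loop; here is a minimal sketch using AtomicLong to track a running maximum (the class name is ours):

```java
import java.util.concurrent.atomic.AtomicLong;

// Lock-free running maximum: readers never block, and a writer retries
// only if another thread updated the value between its read and its CAS.
public class MaxTracker {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    public void update(long sample) {
        long current;
        do {
            current = max.get();
            if (sample <= current) return;   // current max already larger
        } while (!max.compareAndSet(current, sample));
    }

    public long get() {
        return max.get();
    }
}
```

On JDK 8+, `AtomicLong.accumulateAndGet(sample, Math::max)` expresses the same loop in one call.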
// Before: Unbuffered reading, one character at a time
try (FileReader reader = new FileReader("large-file.txt")) {
    int character;
    while ((character = reader.read()) != -1) {
        // Process each character
    }
}

// After: Buffered reading amortizes I/O across the buffer
try (BufferedReader reader = new BufferedReader(
        new FileReader("large-file.txt"), 8192)) { // Custom buffer size
    String line;
    while ((line = reader.readLine()) != null) {
        // Process line
    }
}

// NIO for large files: read through a direct ByteBuffer
try (FileChannel channel = FileChannel.open(Path.of("huge-file.dat"),
        StandardOpenOption.READ)) {
    ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024); // 1MB buffer
    while (channel.read(buffer) != -1) {
        buffer.flip();
        // Process buffer data
        buffer.clear();
    }
}
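For line-oriented work, java.nio.file.Files packages the buffered pattern in one call: Files.lines streams lazily, so memory use stays flat regardless of file size. A small sketch (the class and method names are ours):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class LineCount {
    // Files.lines reads through an internal buffered reader and yields
    // lines lazily; try-with-resources closes the underlying file.
    static long countNonBlank(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path)) {
            return lines.filter(l -> !l.isBlank()).count();
        }
    }
}
```

Closing the stream matters here: unlike in-memory streams, one backed by a file holds an open handle until closed.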
Reuse network connections to reduce the overhead of connection establishment.
// Database connection pooling with HikariCP
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb");
config.setUsername("username");
config.setPassword("password");
config.setMaximumPoolSize(10);
config.setMinimumIdle(5);
config.setIdleTimeout(30000); // milliseconds
HikariDataSource dataSource = new HikariDataSource(config);

// HTTP connection pooling with Apache HttpClient
PoolingHttpClientConnectionManager connectionManager =
        new PoolingHttpClientConnectionManager();
connectionManager.setMaxTotal(200);
connectionManager.setDefaultMaxPerRoute(20);
HttpClient httpClient = HttpClients.custom()
        .setConnectionManager(connectionManager)
        .build();
// Simple in-memory cache with Caffeine
Cache<String, User> userCache = Caffeine.newBuilder()
        .maximumSize(10_000)
        .expireAfterWrite(Duration.ofMinutes(10))
        .recordStats()
        .build();

// Retrieve or compute
User user = userCache.get(userId, key -> userService.fetchUser(key));

// Spring Boot caching
@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager("users", "products");
        cacheManager.setCaffeine(Caffeine.newBuilder()
                .maximumSize(500)
                .expireAfterWrite(Duration.ofMinutes(10)));
        return cacheManager;
    }
}

@Service
public class UserService {
    @Cacheable(value = "users", key = "#userId")
    public User getUser(String userId) {
        // Expensive operation to fetch the user; the result is cached under userId
    }
}
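When pulling in a caching library is overkill, ConcurrentHashMap.computeIfAbsent gives a minimal, thread-safe memoizer; note it has no eviction, so it only suits bounded key spaces (the class name is ours):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal memoizer: each key is computed at most once; concurrent callers
// for the same key wait for the first computation, then see the cached value.
public class Memoizer<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public Memoizer(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }
}
```

One caveat: the loader must not touch the same map recursively, as computeIfAbsent holds an internal bin lock while it runs.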
Performance optimization is an ongoing process that requires a systematic approach based on measurement, analysis, and targeted improvements. By applying the techniques described in this guide, you can identify and eliminate bottlenecks in your Java applications, resulting in better response times, higher throughput, and improved resource utilization.
Remember that premature optimization can lead to unnecessary complexity and maintenance challenges. Always start by measuring performance to identify actual bottlenecks rather than optimizing based on assumptions. Focus your optimization efforts on the critical paths that will provide the most significant benefits to your application's overall performance.