Java Garbage Collection: In-Depth Guide

1️⃣ Introduction

Garbage Collection (GC) is a critical component of the Java Virtual Machine (JVM) that automatically manages memory by identifying and reclaiming objects that are no longer in use. Understanding garbage collection is essential for building high-performance Java applications, especially those that have strict latency requirements or handle large data volumes.

This comprehensive guide explores Java garbage collection in depth, covering:

  • How garbage collection works in the JVM
  • Different garbage collector algorithms
  • GC tuning parameters and strategies
  • Monitoring and analyzing garbage collection performance
  • Best practices for optimizing memory management

2️⃣ Garbage Collection Fundamentals

🔹 Memory Management in the JVM

Java memory is divided into several key areas:

  • Heap: Where objects are allocated and garbage collection occurs
  • Stack: Contains method frames, local variables, and references
  • Metaspace (Java 8+): Stores class metadata in native memory; it replaced PermGen
  • Code Cache: Where JIT-compiled code is stored
  • Native Memory: Memory allocated outside the Java heap (direct ByteBuffers, native libraries)
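
To see how these areas map onto a running JVM, the standard java.lang.management API can list the memory pools the JVM exposes. The sketch below is illustrative only: the class name MemoryAreas is made up for this example, and the pool names it prints depend on the JVM vendor, version, and active collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for this guide; not part of any library.
class MemoryAreas {
    /** Returns one line per JVM memory pool: type, pool name, bytes currently used. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            String type = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(String.format("%-8s %-35s %,d bytes used",
                    type, pool.getName(), usage.getUsed()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

On a HotSpot JVM the heap pools usually correspond to Eden, Survivor, and Old spaces, while Metaspace and the code cache appear as non-heap pools. Thread stacks and most native allocations are not reported here.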

🔹 Heap Generation Model

Traditional Java garbage collectors use a generational approach based on the observation that most objects die young (the "weak generational hypothesis").

Heap Generations

Generation                                | Purpose                                           | Collection Frequency
Young Generation (Eden + Survivor spaces) | Newly created objects                             | Frequent (Minor GC)
Old Generation (Tenured)                  | Long-lived objects promoted from Young Generation | Less frequent (Major GC)

🔹 Object Lifecycle

The typical object lifecycle follows this pattern:

  1. Object is allocated in Eden space
  2. Minor GC occurs when Eden fills up
  3. Live objects move to Survivor space, age counter increments
  4. Objects continue to age with each minor GC
  5. After reaching threshold age, objects promote to Old Generation
  6. Major GC eventually collects unreferenced objects from Old Generation
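
The steps above can be sketched as a toy model. This is not how HotSpot is implemented: GenerationalModel and its methods are hypothetical names, object names stand in for real objects, and the caller supplies the set of live objects instead of the collector tracing references. In a real JVM the promotion age is influenced by -XX:MaxTenuringThreshold and by survivor space occupancy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the generational lifecycle (hypothetical, for illustration only). */
class GenerationalModel {
    private final int tenuringThreshold;                              // age at which an object is promoted
    private final List<String> eden = new ArrayList<>();
    private final Map<String, Integer> survivors = new HashMap<>();   // name -> age
    private final List<String> oldGen = new ArrayList<>();

    GenerationalModel(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** Step 1: new objects start in Eden. */
    void allocate(String name) {
        eden.add(name);
    }

    /** Steps 2-5: one minor GC. Anything not in liveObjects is treated as garbage. */
    void minorGc(Set<String> liveObjects) {
        // Existing survivors either die, age by one, or get promoted.
        Iterator<Map.Entry<String, Integer>> it = survivors.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            int newAge = entry.getValue() + 1;
            if (!liveObjects.contains(entry.getKey())) {
                it.remove();
            } else if (newAge >= tenuringThreshold) {
                oldGen.add(entry.getKey());
                it.remove();
            } else {
                entry.setValue(newAge);
            }
        }
        // Live Eden objects move to Survivor space with age 1; Eden is then empty.
        for (String name : eden) {
            if (!liveObjects.contains(name)) {
                continue;
            }
            if (tenuringThreshold <= 1) {
                oldGen.add(name);
            } else {
                survivors.put(name, 1);
            }
        }
        eden.clear();
    }

    Map<String, Integer> survivors() { return survivors; }

    List<String> oldGeneration() { return oldGen; }

    public static void main(String[] args) {
        GenerationalModel heap = new GenerationalModel(3);
        heap.allocate("cache");   // stays referenced across collections
        heap.allocate("temp");    // garbage by the time of the first minor GC
        for (int gc = 1; gc <= 3; gc++) {
            heap.minorGc(Set.of("cache"));
            System.out.println("after minor GC " + gc + ": survivors=" + heap.survivors()
                    + " old=" + heap.oldGeneration());
        }
    }
}
```

With a threshold of 3, "temp" disappears at the first minor GC, while "cache" ages in Survivor space and is promoted to the old generation at the third.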

3️⃣ Garbage Collector Algorithms

🔹 Serial Collector

The simplest collector, using a single thread for both minor and major collections.

java -XX:+UseSerialGC -Xms1g -Xmx1g MyApplication

Best for: Small applications with modest memory requirements, running on machines with one or two CPU cores.

🔹 Parallel Collector

Uses multiple threads for collection, maximizing throughput but still causing stop-the-world pauses.

java -XX:+UseParallelGC -XX:ParallelGCThreads=4 -Xms4g -Xmx4g MyApplication

Best for: Batch processing applications that prioritize throughput over latency.

🔹 G1 Garbage Collector (Garbage First)

Default since Java 9, G1GC is a server-style collector designed for multi-processor machines with large memories.

java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xms4g -Xmx4g MyApplication

Key features:

  • Region-based memory division (not just young/old)
  • Incremental parallel compaction
  • Predictable pause times
  • Concurrent marking phase

🔹 ZGC (Z Garbage Collector)

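Low-latency collector designed for very large heaps, targeting pause times under 10 ms. ZGC is production-ready since JDK 15; on JDK 11 to 14 it is experimental and additionally requires -XX:+UnlockExperimentalVMOptions.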

java -XX:+UseZGC -Xms8g -Xmx8g MyApplication

Key features:

  • Concurrent garbage collection (almost no stop-the-world)
  • Colored pointers for concurrent operations
  • Scalable from small to multi-terabyte heaps

🔹 Shenandoah

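Another low-latency collector with goals similar to ZGC's but a different implementation. Not every JDK distribution ships Shenandoah, so check that your build includes it before relying on this flag.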

java -XX:+UseShenandoahGC -Xms4g -Xmx4g MyApplication
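
After choosing a collector flag, it can be worth confirming which collector the JVM actually started with. A minimal sketch using the standard GarbageCollectorMXBean API follows; ActiveCollectors is a made-up class name, and the reported bean names vary by JVM version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class for this guide.
class ActiveCollectors {
    /** Names of the GC beans registered by the running JVM. */
    static List<String> names() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        names().forEach(name -> System.out.println("GC bean: " + name));
    }
}
```

With G1 the names are typically "G1 Young Generation" and "G1 Old Generation"; with the Parallel collector, "PS Scavenge" and "PS MarkSweep". Treat these strings as informational and avoid building program logic on them.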

4️⃣ GC Tuning Parameters

🔹 Heap Size Configuration

# Basic heap size settings
-Xms<size>   # Initial heap size
-Xmx<size>   # Maximum heap size

# Example
-Xms4g -Xmx4g              # Fixed heap size (recommended for production)
-Xms1g -Xmx4g              # Growing heap (less predictable)
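
To check that the heap flags took effect, the application can report what the JVM sees through java.lang.Runtime. This is a small sketch under the assumption that a rough view is enough; HeapSettings is a hypothetical class name, and the reported maximum can be slightly below -Xmx because some collectors exclude part of the heap (for example one survivor space) from it.

```java
// Hypothetical helper class for this guide.
class HeapSettings {
    /** Upper bound the heap may grow to (roughly -Xmx). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap memory currently reserved by the JVM (starts near -Xms). */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap:       " + toMiB(maxHeapBytes()) + " MiB");
        System.out.println("committed heap: " + toMiB(committedHeapBytes()) + " MiB");
    }
}
```

Running it once with -Xms4g -Xmx4g and once with -Xms1g -Xmx4g shows the difference between a fixed and a growing heap: in the second case the committed value starts well below the maximum.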

🔹 Generation Sizing

# Young generation sizing
-XX:NewRatio=n             # Ratio of old to young generation
-XX:NewSize=n              # Initial young generation size
-XX:MaxNewSize=n           # Maximum young generation size
-Xmn<size>                 # Shorthand for setting both NewSize and MaxNewSize
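
As a worked example of what -XX:NewRatio means: the value is old/young, so the young generation gets 1/(NewRatio + 1) of the heap. The sketch below only does that arithmetic. GenerationSizing is a hypothetical class, and a real JVM aligns and adapts these sizes, so treat the results as approximations. With G1, fixing the young generation size also interferes with its pause-time goal.

```java
// Hypothetical helper class for this guide; pure arithmetic, no JVM internals.
class GenerationSizing {
    /** Approximate young generation size for a given heap and -XX:NewRatio. */
    static long youngGenMiB(long heapMiB, int newRatio) {
        return heapMiB / (newRatio + 1);
    }

    /** The old generation gets whatever the young generation does not. */
    static long oldGenMiB(long heapMiB, int newRatio) {
        return heapMiB - youngGenMiB(heapMiB, newRatio);
    }

    public static void main(String[] args) {
        long heapMiB = 4096;  // corresponds to -Xmx4g
        int newRatio = 2;     // corresponds to -XX:NewRatio=2
        System.out.println("young ~ " + youngGenMiB(heapMiB, newRatio) + " MiB");
        System.out.println("old   ~ " + oldGenMiB(heapMiB, newRatio) + " MiB");
    }
}
```

For a 4 GiB heap and NewRatio=2 this gives roughly 1365 MiB of young generation and 2731 MiB of old generation.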

🔹 G1GC Specific Tuning

# G1GC pause time goal
-XX:MaxGCPauseMillis=200   # Target pause time in milliseconds

# Initiating heap occupancy
-XX:InitiatingHeapOccupancyPercent=45  # Percentage of heap that triggers concurrent marking

5️⃣ Monitoring and Analyzing GC

🔹 GC Logging

Enable detailed GC logging to understand collection behavior:

# Java 8
java -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/path/to/gc.log MyApplication

# Java 9+
java -Xlog:gc*=info:file=/path/to/gc.log:time,uptime,level,tags MyApplication

🔹 GC Analysis Tools

  • GCViewer: Visualize and analyze GC log files
  • VisualVM: Monitor and profile Java applications (bundled as jvisualvm through JDK 8, a separate download since JDK 9)
  • JMC (Java Mission Control): Analyze JFR recordings for GC events

🔹 Key GC Metrics to Monitor

  • Pause Times: Duration of stop-the-world events
  • Frequency of Collections: How often collections occur
  • Heap Usage Before/After GC: Memory recovered per collection
  • GC Throughput: Percentage of time not spent in GC
  • Allocation Rate: How quickly memory is being allocated
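
Some of these metrics can be approximated from inside the application with the standard management beans. The sketch below computes collection count, accumulated collection time, and throughput since JVM start. GcMetrics is a hypothetical class name, and for concurrent collectors the reported time may include work that did not pause the application, so GC logs remain the more precise source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class for this guide.
class GcMetrics {
    /** Throughput: percentage of elapsed time NOT spent in GC. */
    static double throughputPercent(long gcTimeMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 100.0;
        }
        return 100.0 * (uptimeMillis - gcTimeMillis) / uptimeMillis;
    }

    /** Accumulated collection time over all collectors, in milliseconds. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();   // -1 if the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** Total number of collections over all collectors. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = totalGcTimeMillis();
        System.out.println("collections:   " + totalGcCount());
        System.out.println("GC time (ms):  " + gcTime);
        System.out.printf("throughput:    %.2f%%%n", throughputPercent(gcTime, uptime));
    }
}
```

For example, 500 ms of GC time over 10,000 ms of uptime corresponds to a throughput of 95%.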

6️⃣ Common GC Issues and Solutions

🔹 Long GC Pauses

  • Symptoms: Application response time spikes, high maximum pause times
  • Causes: Large heap, high allocation rate, too many live objects
  • Solutions:
    • Switch to a low-latency collector (G1GC, ZGC, or Shenandoah)
    • Reduce heap size or old generation ratio
    • Reduce object allocation rate

🔹 High GC Frequency

  • Symptoms: Many minor collections per minute
  • Causes: Small young generation, high allocation rate
  • Solutions:
    • Increase young generation size (-Xmn or -XX:NewRatio)
    • Reduce object allocation (object pooling, reduce temporary objects)

🔹 Memory Leaks

  • Symptoms: Gradually increasing heap usage, OutOfMemoryError
  • Causes: Objects not being released due to erroneous references
  • Solutions:
    • Use heap dump analysis tools (jmap, Eclipse MAT)
    • Check for common leak sources (caches, ThreadLocal variables, static collections)
    • Use weak references for caches

# Generate a heap dump
jmap -dump:format=b,file=heapdump.hprof <pid>

# Enable automatic heap dumps on OutOfMemoryError
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/dumps MyApplication
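
One way to apply the "weak references for caches" advice is java.util.WeakHashMap, which holds its keys weakly so that an entry becomes eligible for removal once nothing else references the key. The sketch below is a minimal illustration, not a production cache: WeakCacheDemo is a hypothetical class, and the approach only works when the key objects are otherwise short-lived and the values do not strongly reference their keys.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical helper class for this guide.
class WeakCacheDemo {
    // Keys are held weakly: an entry can be dropped once its key is unreachable elsewhere.
    private final Map<Object, String> cache = new WeakHashMap<>();

    void put(Object key, String value) {
        cache.put(key, value);
    }

    String get(Object key) {
        return cache.get(key);
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        WeakCacheDemo demo = new WeakCacheDemo();
        Object key = new Object();
        demo.put(key, "expensive result");
        System.out.println("while key is referenced: " + demo.get(key));

        key = null;     // drop the only strong reference to the key
        System.gc();    // only a hint; the JVM may or may not collect right away
        System.out.println("entries after GC hint: " + demo.size());
    }
}
```

Because System.gc() is only a request, the final size may be 0 or 1 on any given run. The point is that the map no longer prevents the key from being collected, unlike a static HashMap that keeps growing.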

7️⃣ GC Best Practices

🔹 General Recommendations

  • Use fixed heap size (-Xms equal to -Xmx) in production
  • Size heap appropriately based on application needs and available memory
  • Monitor and tune GC based on application-specific requirements
  • Choose the right collector for your use case

🔹 Application-Level Optimizations

  • Minimize object creation in critical code paths
  • Pool expensive objects when appropriate
  • Use primitive types instead of wrapper classes when possible
  • Avoid String concatenation with + inside loops; prefer StringBuilder
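
Two of these points are easy to show side by side. In the sketch below, joinWithConcat creates a new String on every iteration, while joinWithBuilder reuses one buffer; likewise sumBoxed may allocate wrapper objects where sumPrimitive allocates nothing. AllocationDemo and its method names are made up for illustration, and the actual cost depends on the JIT, which can sometimes optimize such allocations away.

```java
// Hypothetical helper class for this guide.
class AllocationDemo {
    /** Allocation-heavy: every += builds a new String and discards the previous one. */
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part + ",";
        }
        return result;
    }

    /** Allocation-light: one growing buffer, one final String. */
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part).append(',');
        }
        return sb.toString();
    }

    /** Boxed accumulator: each update may create a new Long object. */
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    /** Primitive accumulator: no heap allocation in the loop. */
    static long sumPrimitive(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithConcat(parts));
        System.out.println(joinWithBuilder(parts));
        System.out.println(sumBoxed(1000) + " " + sumPrimitive(1000));
    }
}
```

Both variants of each pair return the same result; they differ only in how much garbage they produce along the way.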

🔹 Collector Selection Guidelines

Collector Selection

Requirement                 | Recommended Collector
Maximum throughput          | Parallel GC
Balanced throughput/latency | G1GC
Minimum latency             | ZGC/Shenandoah
Small footprint             | Serial GC

8️⃣ Advanced Topics

🔹 GC and Containers

Running Java applications in containers requires special consideration for memory settings:

# Container support is enabled by default since JDK 10 (backported to 8u191), so this flag rarely needs to be set
java -XX:+UseContainerSupport MyApplication

# Set heap size as a percentage of container memory instead of fixed values
java -XX:MaxRAMPercentage=75.0 -XX:InitialRAMPercentage=75.0 MyApplication
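
To make the percentage flags concrete, the sketch below estimates the maximum heap for a given container limit and prints what the running JVM actually detected. ContainerHeap is a hypothetical class, and the 2048 MiB limit is an assumed example value. The estimate ignores alignment and other JVM ergonomics, so the real figure can differ somewhat.

```java
// Hypothetical helper class for this guide.
class ContainerHeap {
    /** Rough maximum heap implied by -XX:MaxRAMPercentage for a container memory limit. */
    static long expectedMaxHeapMiB(long containerLimitMiB, double maxRamPercentage) {
        return (long) (containerLimitMiB * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        // Assumed example: a container limited to 2048 MiB, started with MaxRAMPercentage=75.0
        System.out.println("expected max heap: " + expectedMaxHeapMiB(2048, 75.0) + " MiB");

        // What this JVM detected; inside a container these should reflect the container limits.
        Runtime rt = Runtime.getRuntime();
        System.out.println("observed max heap: " + rt.maxMemory() / (1024L * 1024L) + " MiB");
        System.out.println("available CPUs:    " + rt.availableProcessors());
    }
}
```

A 2048 MiB container at 75% yields about 1536 MiB of heap, leaving the remainder for Metaspace, thread stacks, the code cache, and other native memory.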

🔹 GC-Friendly Data Structures

Some data structures create less GC pressure:

  • Off-heap structures (ByteBuffer.allocateDirect())
  • Primitive arrays instead of object collections
  • Compact specialized collections
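
As an illustration of the first two points, the sketch below stores long values in a direct ByteBuffer, so the data lives outside the Java heap and the collector only sees a single small buffer object instead of many boxed Long instances. OffHeapLongArray is a hypothetical class written for this guide. Direct memory is limited separately from the heap (see -XX:MaxDirectMemorySize) and is released only when the buffer object itself is collected.

```java
import java.nio.ByteBuffer;

/** Fixed-size array of longs backed by off-heap memory (hypothetical example class). */
class OffHeapLongArray {
    private final ByteBuffer buffer;
    private final int length;

    OffHeapLongArray(int length) {
        this.length = length;
        // multiplyExact guards against int overflow for very large lengths
        this.buffer = ByteBuffer.allocateDirect(Math.multiplyExact(length, Long.BYTES));
    }

    void set(int index, long value) {
        buffer.putLong(index * Long.BYTES, value);   // absolute write; throws if out of range
    }

    long get(int index) {
        return buffer.getLong(index * Long.BYTES);   // absolute read; throws if out of range
    }

    int length() {
        return length;
    }

    boolean isOffHeap() {
        return buffer.isDirect();
    }

    public static void main(String[] args) {
        OffHeapLongArray values = new OffHeapLongArray(1_000);
        values.set(3, 42L);
        System.out.println("values[3] = " + values.get(3) + ", off-heap = " + values.isOffHeap());
    }
}
```

For data that can stay on the heap, a plain long[] gives most of the same benefit over List<Long> with less complexity.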

9️⃣ Q&A / Frequently Asked Questions

Q: Which garbage collector should I choose?

Choose the collector based on your application's needs:

  1. G1GC (the default) provides a good balance of throughput and latency for most applications. It is particularly effective for heap sizes of roughly 4 GB to 32 GB.
  2. ZGC offers very low pause times (under 10 ms), even with large heaps, at the cost of slightly lower throughput. Use it for latency-sensitive applications.
  3. Parallel GC maximizes throughput at the expense of longer pause times, which makes it well suited to batch processing.
  4. Serial GC is best for small applications with minimal memory needs running on machines with few CPU cores.

Always benchmark your specific application with different collectors, because performance characteristics vary with allocation patterns and memory pressure.

Q: What causes OutOfMemoryError, and how do I fix it?

OutOfMemoryError has several possible causes:

  1. Memory leaks: objects remain referenced but unused, often in static collections or poorly managed caches. Identify them with heap dump analysis and remove the references that retain them.
  2. Undersized heap: the heap is too small for the application's memory requirements. Increase -Xmx or reduce memory usage.
  3. Metaspace exhaustion: too many classes are loaded. Raise -XX:MaxMetaspaceSize if it is set, or reduce class loading (for example, fix class loader leaks).
  4. Native memory issues: direct ByteBuffers or native code use too much off-heap memory. Monitor and limit native allocations.

Enable automatic heap dumps (-XX:+HeapDumpOnOutOfMemoryError) to help diagnose the root cause.

Q: How should I configure garbage collection in containerized environments?

For containerized environments:

  1. Use JDK 10 or later (or 8u191 and later), where container limits are detected by default through -XX:+UseContainerSupport. Older JDKs do not have this flag and size the heap from the host's memory, so set -Xmx explicitly there.
  2. Use percentage-based memory flags like -XX:MaxRAMPercentage=75.0 instead of fixed values to adapt to container memory limits.
  3. Set memory limits explicitly in the container configuration.
  4. Consider smaller but more numerous containers for better resource granularity.
  5. Monitor both GC metrics and container metrics to ensure containers aren't being throttled or OOM-killed.
  6. Use G1GC for most containerized workloads, as it adapts well to varying resource availability.

🔟 Best Practices & Pro Tips 🚀

  • Use realistic production-like workloads when tuning GC
  • Monitor GC behavior after application updates
  • Start with defaults, tune only when necessary
  • Measure the impact of GC tuning changes
  • Keep object lifetimes short
  • Minimize large object allocations (to reduce humongous regions in G1GC)
  • Enable GC logging in production with log rotation
  • Scale horizontally with smaller heaps rather than vertically with very large heaps when possible
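  • Avoid medium-lived objects that survive just long enough to be promoted and then die; objects should ideally be either short-lived or live for the lifetime of the application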

Conclusion

Understanding Java garbage collection is crucial for optimizing application performance, especially in production environments. The JVM offers several garbage collection algorithms, each with its own strengths and appropriate use cases. By selecting the right collector, tuning key parameters, and following best practices, you can significantly improve application performance and stability.

Remember that garbage collection is an ongoing concern rather than a one-time optimization. Monitor your application's GC behavior, analyze patterns, and adjust settings as your application evolves and usage patterns change.