Java Collections Guide: HashMap, Hashtable, ConcurrentHashMap, HashSet (2025)


Java Collections

Java collections are fundamental data structures that every Java developer should understand. This comprehensive guide explores HashMap, Hashtable, ConcurrentHashMap, and HashSet, their differences, use cases, and best practices for thread-safe operations.

Pro Tip: Understanding the differences between these collections is crucial for writing efficient and thread-safe Java applications.

Introduction to Java Collections

Note: The Java Collections Framework provides a unified architecture for representing and manipulating collections.

Java collections are implementations of various data structures that store and manipulate groups of objects. The main collections we'll discuss are:

  • HashMap: Most commonly used map implementation
  • Hashtable: Legacy thread-safe map implementation
  • ConcurrentHashMap: Modern thread-safe map implementation
  • HashSet: Set implementation based on HashMap

HashMap: The Most Popular Map Implementation

Pro Tip: HashMap is the most commonly used map implementation in Java due to its performance and flexibility.

HashMap is a hash table-based implementation of the Map interface. It provides constant-time performance for basic operations (get and put) assuming the hash function disperses the elements properly.

Key Features

  • Not synchronized (not thread-safe)
  • Allows one null key and any number of null values
  • No guarantee of order
  • Default initial capacity of 16
  • Load factor of 0.75

Example Implementation


import java.util.HashMap;
import java.util.Map;

// Creating a HashMap that maps String keys to Integer values
Map<String, Integer> map = new HashMap<>();

// Adding elements
map.put("one", 1);
map.put("two", 2);
map.put("three", 3);

// Getting values
Integer value = map.get("two"); // Returns 2

// Iterating through entries (iteration order is not guaranteed)
for (Map.Entry<String, Integer> entry : map.entrySet()) {
    System.out.println(entry.getKey() + ": " + entry.getValue());
}
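
The null handling and capacity settings listed under Key Features can be sketched as follows. This is a minimal illustration, not a recommended utility: the class name and the capacityFor helper are hypothetical, and the helper simply assumes the default load factor of 0.75.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapFeaturesDemo {
    // Returns the value stored under the null key, showing that HashMap accepts it.
    static String nullKeyLookup() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for null key"); // a single null key is allowed
        map.put("other", null);              // null values are allowed too
        return map.get(null);
    }

    // get() returns null both for an absent key and for a key mapped to null,
    // so containsKey() is needed to tell the two cases apart.
    static boolean distinguishesNullValue() {
        Map<String, String> map = new HashMap<>();
        map.put("present", null);
        return map.get("present") == null
                && map.containsKey("present")
                && !map.containsKey("absent");
    }

    // Hypothetical helper: an initial capacity large enough that expectedEntries
    // fit under the default 0.75 load factor without triggering a resize.
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / 0.75);
    }

    public static void main(String[] args) {
        System.out.println(nullKeyLookup());
        System.out.println(distinguishesNullValue());
        Map<Integer, String> presized = new HashMap<>(capacityFor(1000));
        presized.put(1, "one");
        System.out.println(presized.size());
    }
}
```

Presizing only matters when the expected number of entries is known in advance; for small or short-lived maps the defaults are usually fine.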

Hashtable: The Legacy Thread-Safe Map

Note: Hashtable is a legacy class that has been superseded by ConcurrentHashMap in most cases.

Hashtable is a synchronized implementation of the Map interface. It dates back to JDK 1.0, before the collections framework existed, and was retrofitted to implement Map in Java 1.2. It has been largely replaced by ConcurrentHashMap for thread-safe operations.

Key Features

  • Synchronized (thread-safe)
  • Does not allow null keys or values
  • Slower than HashMap due to synchronization
  • Default initial capacity of 11
  • Load factor of 0.75

Example Implementation


import java.util.Enumeration;
import java.util.Hashtable;

// Creating a Hashtable that maps String keys to Integer values
Hashtable<String, Integer> table = new Hashtable<>();

// Adding elements
table.put("one", 1);
table.put("two", 2);
table.put("three", 3);

// Getting values
Integer value = table.get("two"); // Returns 2

// Iterating through keys with the legacy Enumeration API
Enumeration<String> keys = table.keys();
while (keys.hasMoreElements()) {
    String key = keys.nextElement();
    System.out.println(key + ": " + table.get(key));
}
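
The null restriction listed under Key Features is easy to verify. The sketch below uses hypothetical class and method names and simply checks whether a put with a null key or a null value throws NullPointerException.

```java
import java.util.Hashtable;
import java.util.Map;

class HashtableNullDemo {
    // Returns true if the map rejects a null key with a NullPointerException.
    static boolean rejectsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 1);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Returns true if the map rejects a null value with a NullPointerException.
    static boolean rejectsNullValue(Map<String, Integer> map) {
        try {
            map.put("key", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new Hashtable<>();
        table.put("one", 1);
        System.out.println("rejects null key:   " + rejectsNullKey(table));
        System.out.println("rejects null value: " + rejectsNullValue(table));
        // Hashtable implements Map, so entrySet() works as it does for HashMap.
        for (Map.Entry<String, Integer> entry : table.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

Passing a HashMap to the same two methods returns false in both cases, which is the difference the feature lists describe.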

ConcurrentHashMap: Modern Thread-Safe Map

Pro Tip: Use ConcurrentHashMap when you need thread-safe operations with better performance than Hashtable.

ConcurrentHashMap is a thread-safe implementation of the Map interface that provides better performance than Hashtable under contention because it never locks the whole table for an update.

Key Features

  • Thread-safe without locking the entire table
  • Better performance than Hashtable
  • Allows concurrent read and write operations
  • No null keys or values allowed
  • Uses CAS and per-bin locking (Java 8+); Java 7 and earlier used segment-level locking

Example Implementation


import java.util.concurrent.ConcurrentHashMap;

// Creating a ConcurrentHashMap that maps String keys to Integer values
ConcurrentHashMap<String, Integer> concurrentMap = new ConcurrentHashMap<>();

// Adding elements
concurrentMap.put("one", 1);
concurrentMap.put("two", 2);
concurrentMap.put("three", 3);

// Atomic operations
concurrentMap.computeIfAbsent("four", k -> 4);           // adds "four" -> 4
concurrentMap.computeIfPresent("two", (k, v) -> v * 2);  // "two" is now 4

// Iterating through entries
concurrentMap.forEach((key, value) ->
    System.out.println(key + ": " + value));
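
To see why the atomic methods matter, here is a small sketch in which several threads increment the same key. The class name, method name, and the 30-second timeout are assumptions made for illustration; merge() is the standard ConcurrentHashMap method that performs the read-modify-write as one atomic step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ConcurrentCounterDemo {
    // Runs `threads` workers that each increment the same key
    // `incrementsPerThread` times, then returns the final count.
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    // merge() updates the value for "hits" atomically
                    counts.merge("hits", 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000)); // 40000
    }
}
```

Writing the same update as a separate get() followed by put() would not be atomic, even on a ConcurrentHashMap, and could lose increments.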

HashSet: The Set Implementation

Note: HashSet is implemented using a HashMap internally, which means it shares many characteristics with HashMap.

HashSet is an implementation of the Set interface that uses a HashMap internally. It stores unique elements and provides constant-time performance for basic operations.

Key Features

  • Not synchronized (not thread-safe)
  • Allows a single null element
  • No guarantee of order
  • Uses HashMap internally
  • Constant-time performance for basic operations

Example Implementation


import java.util.HashSet;
import java.util.Set;

// Creating a HashSet of String elements
Set<String> set = new HashSet<>();

// Adding elements
set.add("one");
set.add("two");
set.add("three");

// Checking for existence
boolean contains = set.contains("two"); // Returns true

// Iterating through elements (iteration order is not guaranteed)
for (String element : set) {
    System.out.println(element);
}
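
The uniqueness guarantee shows up in the return value of add(). The following sketch, with a hypothetical class and method name, counts distinct words and reports duplicates as they are skipped.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetUniquenessDemo {
    // Adds every word to a HashSet and returns how many distinct words there were.
    static int countDistinct(String... words) {
        Set<String> seen = new HashSet<>();
        for (String word : words) {
            // add() returns false when the element is already present
            boolean added = seen.add(word);
            if (!added) {
                System.out.println("duplicate ignored: " + word);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // "two" and null each appear twice but are stored once
        System.out.println(countDistinct("one", "two", "two", null, null)); // 3
    }
}
```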

Detailed Comparison

Pro Tip: Choose the right collection based on your specific requirements for thread-safety, performance, and functionality.

HashMap vs Hashtable vs ConcurrentHashMap

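Feature           | HashMap                   | Hashtable                | ConcurrentHashMap
Thread Safety     | No                        | Yes                      | Yes
Null Keys/Values  | Allowed                   | Not Allowed              | Not Allowed
Performance       | Fastest (single-threaded) | Slowest under contention | Scales well under contention
Locking Mechanism | None                      | Table-level              | Per-bin with CAS (Java 8+); segment-level in Java 7 and earlier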

HashSet vs HashMap

Feature                 | HashSet          | HashMap
Interface               | Set              | Map
Stores                  | Single values    | Key-value pairs
Null Values             | One null element | One null key, any number of null values
Internal Implementation | Uses HashMap     | Hash table
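
The "Uses HashMap" row can be made concrete with a toy set that stores its elements as map keys and ignores the values. This is a simplified sketch with a hypothetical class name, not the java.util.HashSet source, and it omits iteration and most of the Set interface.

```java
import java.util.HashMap;
import java.util.Map;

// Toy map-backed set for illustration only; use java.util.HashSet in real code.
class MapBackedSet<E> {
    // Shared dummy value: only the keys carry information
    private static final Object PRESENT = new Object();
    private final Map<E, Object> map = new HashMap<>();

    // Returns true if the element was not already present.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // Returns true if the element was present and has been removed.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        System.out.println(set.add("one"));      // true
        System.out.println(set.add("one"));      // false, already present
        System.out.println(set.contains("one")); // true
        System.out.println(set.size());          // 1
    }
}
```

Because the elements are map keys, the set inherits the map's behavior: unique keys, a single null, and no ordering guarantee.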

Best Practices and Use Cases

Note: Always consider thread-safety requirements and performance implications when choosing a collection.

When to Use Each Collection

  • HashMap: Use when thread-safety is not required and you need the best performance
  • Hashtable: Avoid in new code; use ConcurrentHashMap instead
  • ConcurrentHashMap: Use when thread-safety is required and you need good performance
  • HashSet: Use when you need to store unique elements and don't need key-value pairs

Performance Considerations

  • Choose appropriate initial capacity to avoid resizing
  • Consider using ConcurrentHashMap instead of Hashtable for better performance
  • Use HashMap when thread-safety is not required
  • Consider using Collections.synchronizedMap() for simple thread-safety needs
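
One caveat applies to the Collections.synchronizedMap() option: the wrapper synchronizes individual calls, but iteration still has to be guarded by the caller, as the method's Javadoc requires. A minimal sketch, with hypothetical class and method names:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedMapDemo {
    // Sums all values. Iterating over a synchronized wrapper must hold the
    // wrapper's own lock for the whole traversal.
    static int sumValues(Map<String, Integer> syncMap) {
        int total = 0;
        synchronized (syncMap) {
            for (int value : syncMap.values()) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> syncMap = Collections.synchronizedMap(new HashMap<>());
        syncMap.put("one", 1); // single calls are synchronized by the wrapper
        syncMap.put("two", 2);
        System.out.println(sumValues(syncMap)); // 3
    }
}
```

Every call goes through one lock, so this approach suits low-contention code; under heavy contention ConcurrentHashMap is usually the better fit.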

Advanced Concepts

Pro Tip: Understanding the internal workings of ConcurrentHashMap can help you make better decisions about its usage in high-performance applications.

Segment-Level Locking in ConcurrentHashMap

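In Java 7 and earlier, ConcurrentHashMap achieved thread-safety through segment-level locking (also called lock striping). Java 8 replaced segments with CAS operations and per-bin locks, but the older design is still a useful illustration of how finer-grained locking improves concurrency. Here's how it worked:

  • The map was divided into multiple segments (16 by default, set by the concurrencyLevel constructor argument)
  • Each segment was protected by its own lock
  • Different threads could modify different segments concurrently
  • Only one thread could modify a given segment at a time
  • Reads generally did not acquire the segment lock and relied on volatile reads instead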

Segment Structure


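// Simplified segment structure (Java 7 and earlier)
static final class Segment<K,V> extends ReentrantLock {
    volatile HashEntry<K,V>[] table; // this segment's own hash table
    transient int count;             // number of entries in this segment
    transient int modCount;          // number of structural modifications
    // ... other fields
}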
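
The idea of giving each part of the map its own lock can be illustrated with a toy striped map. This is a sketch for intuition only, not the JDK implementation: the class name is hypothetical, each stripe is a plain HashMap, and reads take the stripe lock for simplicity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Toy illustration of lock striping; NOT how the JDK implements ConcurrentHashMap.
class StripedMap<K, V> {
    private final ReentrantLock[] locks;
    private final Map<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMap(int stripeCount) {
        locks = new ReentrantLock[stripeCount];
        stripes = (Map<K, V>[]) new Map[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            locks[i] = new ReentrantLock();
            stripes[i] = new HashMap<>();
        }
    }

    // Maps a key to the index of the stripe that owns it.
    private int stripeFor(Object key) {
        return Math.floorMod(key.hashCode(), locks.length);
    }

    V put(K key, V value) {
        int s = stripeFor(key);
        locks[s].lock(); // blocks only writers and readers of this stripe
        try {
            return stripes[s].put(key, value);
        } finally {
            locks[s].unlock();
        }
    }

    V get(K key) {
        int s = stripeFor(key);
        locks[s].lock(); // the toy locks reads too, to keep the code short
        try {
            return stripes[s].get(key);
        } finally {
            locks[s].unlock();
        }
    }

    public static void main(String[] args) {
        StripedMap<String, Integer> map = new StripedMap<>(16);
        map.put("one", 1);
        System.out.println(map.get("one")); // 1
    }
}
```

Two threads that touch keys in different stripes never wait for each other, which is the property the bullet list describes.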

Compare-And-Swap (CAS) Operations

Since Java 8, ConcurrentHashMap combines CAS operations with fine-grained locking. A CAS installs the first node of an empty bin without taking any lock, while updates to a bin that already holds entries synchronize on that bin's first node. Reads take no locks at all and rely on volatile reads.

How Key Operations Synchronize

  • get(): Uses volatile reads and never blocks
  • putIfAbsent(): Uses CAS to add an entry to an empty bin; locks the bin otherwise
  • replace(): Locks the bin that holds the entry
  • remove(): Locks the bin that holds the entry

Example of CAS Implementation


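// Simplified sketch of the Java 8+ insertion path (not the full JDK source;
// resizing, tree bins, and size accounting are omitted)
public V putIfAbsent(K key, V value) {
    return putVal(key, value, true);
}

final V putVal(K key, V value, boolean onlyIfAbsent) {
    int hash = spread(key.hashCode());
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i;
        if (tab == null || (n = tab.length) == 0) {
            tab = initTable();
        } else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // Empty bin: publish the new node with a single CAS, no lock
            if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))
                break;
            // CAS lost a race with another thread: loop and retry
        } else {
            // Occupied bin: lock only the first node of this bin
            synchronized (f) {
                // ... walk the bin; update the value or append a new node
            }
            break;
        }
    }
    return null; // ... the real method returns the previous value, if any
}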
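
The same compare-and-swap pattern is available to application code. The sketch below shows an explicit retry loop on an AtomicInteger and the equivalent map-level pattern using the conditional replace(key, oldValue, newValue). Class and method names are hypothetical, and the map example assumes keys are never removed concurrently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class CasRetryDemo {
    // Doubles the counter with an explicit compare-and-swap retry loop.
    static int doubleWithCas(AtomicInteger counter) {
        while (true) {
            int current = counter.get();
            int next = current * 2;
            // Succeeds only if no other thread changed the value since we read it
            if (counter.compareAndSet(current, next)) {
                return next;
            }
            // Otherwise another thread won the race: re-read and try again
        }
    }

    // Same pattern at the map level. Assumes the key is never removed
    // by another thread while this method runs.
    static int incrementWithReplace(ConcurrentHashMap<String, Integer> map, String key) {
        map.putIfAbsent(key, 0);
        while (true) {
            Integer current = map.get(key);
            Integer next = current + 1;
            // replace() swaps the value only if it still equals `current`
            if (map.replace(key, current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(doubleWithCas(new AtomicInteger(21))); // 42
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        incrementWithReplace(hits, "page");
        System.out.println(incrementWithReplace(hits, "page"));   // 2
    }
}
```

In practice merge() or compute() expresses the map update more directly; the loop is shown only to make the retry logic visible.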

Performance Optimization Tips

Note: These optimizations are particularly important in high-concurrency scenarios.

  • Initial Capacity: Set an appropriate initial capacity to minimize resizing
  • Concurrency Level: Since Java 8, the concurrencyLevel constructor argument is only a sizing hint; in Java 7 and earlier it set the number of segments
  • Load Factor: Since Java 8, the loadFactor constructor argument affects only the initial table size
  • Bulk Operations: Prefer the built-in bulk methods (forEach, search, reduce) for whole-map work; note that putAll and clear are not atomic
  • Iteration: Iterators are weakly consistent: they never throw ConcurrentModificationException and may or may not reflect updates made after they were created
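
The iteration behavior is the easiest of these to demonstrate. In the sketch below (hypothetical class and method names), entries are removed while the key set is being traversed; a plain HashMap would typically throw ConcurrentModificationException in the same situation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WeaklyConsistentIterationDemo {
    // Removes entries with even keys while iterating; returns how many are left.
    static int removeEvensWhileIterating(int entries) {
        // Presized for the expected number of entries to limit resizing
        Map<Integer, Integer> map = new ConcurrentHashMap<>(entries);
        for (int i = 0; i < entries; i++) {
            map.put(i, i);
        }
        for (Integer key : map.keySet()) {
            if (key % 2 == 0) {
                // Modifying the map during iteration is allowed here
                map.remove(key);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(removeEvensWhileIterating(10)); // 5
    }
}
```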

Advanced Use Cases

  • High-Frequency Updates: Use ConcurrentHashMap for frequently updated data
  • Read-Heavy Workloads: Excellent for scenarios with many readers and few writers
  • Cache Implementation: Perfect for implementing thread-safe caches
  • Concurrent Data Structures: Use as a building block for other concurrent structures
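
As an example of the cache use case, here is a minimal memoizing cache built on computeIfAbsent. The class name, the loader function, and the load counter are all assumptions for illustration; a production cache would also need eviction and error handling, which this sketch leaves out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class MemoizingCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger(); // counts loader calls

    MemoizingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent runs the loader at most once per absent key,
        // even when several threads ask for the same key at once
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        MemoizingCache<String, Integer> lengths = new MemoizingCache<>(String::length);
        System.out.println(lengths.get("hello")); // 5
        System.out.println(lengths.get("hello")); // 5, served from the cache
        System.out.println(lengths.loadCount());  // 1
    }
}
```

The loader should be short and must not modify the same map, because other updates to that bin may be blocked while it runs.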

Conclusion

Understanding the differences between HashMap, Hashtable, ConcurrentHashMap, and HashSet is crucial for writing efficient and thread-safe Java applications. Choose the right collection based on your specific requirements for thread-safety, performance, and functionality.