Java Collections

This tutorial explains the usage of the Java Collections, e.g. List, Set and Map.

Java Collections

A collection is a structure that organizes multiple data items into a single group. The collections typically represent the data items that are related to each other. Collections are used to manipulate data (perform operations on it), store and retrieve data, and transfer data between methods.

A java collection consists of three elements:
  • Interfaces: Pretty much like abstract data types, and allow you to manipulate collections independently of the implementation details.
  • Implementations: The concrete implementations of the collections interfaces. You can look upon them as reusable data structures.
  • Algorithms: The methods that perform operations such as search and sorting on the objects in the data structures. You can also look upon them as reusable functionality.

The Collections Interfaces

The root of the hierarchy of the collections interfaces is the Collection interface, also referred to as the superinterface of the collections. There is another kind of collections called maps, which are represented by the superinterface Map, which is not derived from the Collection interface. Both kinds of collections interfaces are shown in below Figure.


Collection Hierarchy in Java




The List and Set interfaces extend from the Collection interface while SortedMap extends from Map interface.


List: An ordered collection of data items; i.e. you know exactly where each item is in the list. A list can contain duplicate elements. ArrayList, LinkedList, and Vector are the classes that implement the List interface.

Map: An object that maps keys to values: each key can map to at most one value. Maps cannot contain duplicate keys. HashMap and HashTable are examples of classes that implement the Map interface. No duplicate keys are allowed in maps; duplicate values are allowed.

Set: A collection of unique items; i.e. there are no duplicates. HashSet and LinkedHashSet are examples of classes that implement the Set interface.


List Interface Implementation

ArrayList: Think of ArrayList as an array that can grow in number of elements. Just like an array, it provides constant-time access to a specific element in the list, but insertions and deletions are linear in time. If you need to do insertions and deletions frequently and random access of elements rarely, use LinkedList, instead. So, random access (or search) is fast in ArrayList, but insertions and deletions are not.

LinkedList: A node in a LinkedList contains the data item and the pointer to the next node. So, a given element is searched by traveling from node to node. Therefore, random access in LinkedList is linear in time, but insertions and deletions are constant in time. So, insertions and deletions are fast in LinkedList, but random access (or search) is not.

Vector: Vector is pretty much like ArrayList but it provides support for synchronization for thread safety. If you are not programming for a multithreaded environment (that is, you don’t need thread safety), use ArrayList instead of Vectors, because otherwise you are impeding the performance for no gain. Even if you are programming for a multithreaded environment, you can still use ArrayList and use the synchronization methods provided by the Collections class.

Set Interface Implementation

HashSet: This provides the faster access to a data item as compared to TreeSet, but it offers no guarantee that the items will be ordered. Assume that it is unsorted and unordered. Note that it does not offer synchronization, so if you use it in a multithreaded environment, you may need to use the synchronization methods from the Collections class.

TreeSet: This presents sorted data items, but the access performance is not as good as HashSet. Note that it does not offer synchronization, so if you use it in a multithreaded environment, you may need to use the synchronization methods from the Collections class.

LinkedHashSet: A LinkedHashSet is like a HashSet that maintains a doubly linked list that runs through all the data items. It is an ordered collection, ordered by insertion, but not sorted. Note that it does not offer synchronization, so if you use it in a multithreaded environment, you may need to use the synchronization methods from the Collections class.

Map Interface Implementation

HashTable: This implementation is based on the hashtable data structure, and it can use any non-null object as a key or a value. It offers no guarantee that the data items will stay ordered. The objects used as keys must implement the hashCode() method and the equals(…) method in order to successfully store and retrieve objects from a hashtable. This implementation is synchronized.

HashMap: This implementation is also based on the hashtable data structure. It is like the HashTable implementation, but it allows null and is unsynchronized. It offers no guarantee that the data items will stay ordered.

LinkedHashMap: This implementation is different from HashMap in that it maintains a doubly linked list that runs through all of its entries. Furthermore, it defines the iteration order, which is usually the order in which the keys are inserted into the map. Use this list if the unspecified ordering provided by Hashtable and HashMap is not suitable for your application.

TreeMap: TreeMap implements the SortedMap interface. It guarantees that the map will be in the ascending key order (that is, sorted). This implementation is unsynchronized.


Reference(s): SCJP Exam for J2SE 5

1 comment: