Java variable type Collection for HashSet or other implementations?

I have often seen declarations like List<String> list = new ArrayList<>(); or Set<String> set = new HashSet<>(); for fields in classes. To me it makes perfect sense to use interfaces as the variable types, since that keeps the implementation flexible. The examples above still specify which kind of collection is used, and therefore which operations are allowed and how the collection behaves in certain cases (according to the docs).

Now consider the case where actually only the functionality of the Collection (or even the Iterable) interface is required to use the field in the class and the kind of Collection doesn't actually matter or I don't want to overspecify it. So I choose for example HashSet as implementation and declare the field as Collection<String> collection = new HashSet<>();.

Should the field actually be of type Set in this case? Is this kind of declaration bad practice, and if so, why? Or is it good practice to specify the actual type as little as possible (while still providing all required methods)? I ask because I have hardly ever seen such a declaration, and lately I find myself more and more often in situations where I only need the functionality of the Collection interface.

Example:

// Only need Collection features, but decided to use a LinkedList
private final Collection<Listener> registeredListeners = new LinkedList<>();

public void init() {
    ExampleListener listener = new ExampleListener();
    registerListenerSomewhere(listener);
    registeredListeners.add(listener);
    listener = new ExampleListener();
    registerListenerSomewhere(listener);
    registeredListeners.add(listener);
}

public void reset() {
    for (Listener listener : registeredListeners) {
        unregisterListenerSomewhere(listener);
    }

    registeredListeners.clear();
}

Since your example uses a private field, hiding the implementation type doesn't matter all that much. You (or whoever is maintaining this class) can always just go look at the field's initializer to see what it is.

Depending on how it's used, though, it might be worth declaring a more specific interface for the field. Declaring it to be a List indicates that duplicates are allowed and that ordering is significant. Declaring it to be a Set indicates that duplicates aren't allowed and that ordering is not significant. You might even declare the field to have a particular implementation class if there's something about it that's significant. For example, declaring it to be LinkedHashSet indicates that duplicates aren't allowed but that ordering is significant.
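As a rough illustration of what each declared type signals, here is a minimal sketch. The class name DeclaredTypeDemo and the helper fill are hypothetical and exist only for this example; everything else is standard java.util.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DeclaredTypeDemo {

    // Hypothetical helper: adds the same three values, one of them a duplicate.
    static void fill(Collection<String> target) {
        target.add("b");
        target.add("a");
        target.add("b");
    }

    public static void main(String[] args) {
        // List: duplicates are kept and insertion order is part of the contract.
        List<String> list = new ArrayList<>();
        fill(list);
        System.out.println(list); // prints [b, a, b]

        // Set: duplicates are dropped; HashSet promises no particular iteration order.
        Set<String> set = new HashSet<>();
        fill(set);
        System.out.println(set.size()); // prints 2

        // LinkedHashSet: duplicates are dropped, but insertion order is preserved.
        LinkedHashSet<String> ordered = new LinkedHashSet<>();
        fill(ordered);
        System.out.println(ordered); // prints [b, a]
    }
}
```

Had all three fields been declared as Collection<String>, the code would still compile, but a reader could no longer tell from the declaration which of these behaviors the class relies on.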

The choice of whether to use an interface, and which interface to use, becomes much more significant if the type appears in the public API of the class, and it also depends on the compatibility constraints on this class. For example, suppose there were a method

public ??? getRegisteredListeners() {
    return ...
}

Now the choice of return type affects other classes. If you can change all the callers, maybe it's no big deal; you just have to edit other files. But suppose the caller is an application that you have no control over. Now the choice of interface is critical, as you can't change it without potentially breaking those applications. The rule here is usually to choose the most abstract interface that supports the operations you expect callers to want to perform.

Most of the Java SE APIs return Collection. This provides a fair degree of abstraction from the underlying implementation, but it also provides the caller a reasonable set of operations. The caller can iterate, get the size, do a contains check, or copy all the elements to another collection.

Some code bases use Iterable as the most-abstract interface to return. All it does is allow the caller to iterate. Sometimes this is all that's necessary, but it might be somewhat limiting compared to Collection.

Another alternative is to return a Stream. This is helpful if you think the caller might want to use stream operations (such as filter, map, findFirst, etc.) instead of iterating or using collection operations.

Note that if you choose to return Collection or Iterable, you need to make sure that you return an unmodifiable view or make a defensive copy. Otherwise, callers could modify your class's internal data, which would probably lead to bugs. (Yes, even an Iterable can permit modification! Consider getting an Iterator and then calling the remove() method.) If you return a Stream, you don't need to worry about that, since you can't use a Stream to modify the underlying source.
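The three options could look roughly like the sketch below. The class ListenerRegistry and its method names are hypothetical, and listeners are represented as plain String names to keep the example self-contained; the question's Listener type would work the same way.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.stream.Stream;

public class ListenerRegistry {

    // Internal state; the field type is an implementation detail of this class.
    private final Collection<String> registeredListeners = new ArrayList<>();

    public void register(String listener) {
        registeredListeners.add(listener);
    }

    // Option 1: unmodifiable view. It reflects later registrations but
    // throws UnsupportedOperationException on any attempt to modify it,
    // including Iterator.remove().
    public Collection<String> getRegisteredListeners() {
        return Collections.unmodifiableCollection(registeredListeners);
    }

    // Option 2: defensive copy. The caller gets an independent snapshot
    // and may modify it freely without touching the internal state.
    public Collection<String> copyOfRegisteredListeners() {
        return new ArrayList<>(registeredListeners);
    }

    // Option 3: a Stream, which offers no way to modify its source.
    public Stream<String> registeredListenersStream() {
        return registeredListeners.stream();
    }

    public static void main(String[] args) {
        ListenerRegistry registry = new ListenerRegistry();
        registry.register("first");

        Collection<String> view = registry.getRegisteredListeners();
        try {
            view.add("intruder");
        } catch (UnsupportedOperationException e) {
            System.out.println("view rejected modification");
        }

        Collection<String> copy = registry.copyOfRegisteredListeners();
        copy.add("only in the copy");

        registry.register("second");
        System.out.println(view.size()); // prints 2: the view tracks the registry
        System.out.println(copy.size()); // prints 2: "first" plus the caller's own addition
    }
}
```

Whether a live view or a snapshot is the better choice depends on whether callers should see later registrations; neither lets them alter the registry's own collection.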

Note that I turned your question about the declaration of a field into a question about the declaration of method return types. There is this idea of "program to the interface" that's quite prevalent in Java. In my opinion it doesn't matter very much for local variables (which is why it's usually fine to use var), and it matters little for private fields, since those (almost) by definition affect only the class in which they're declared. However, the "program to the interface" principle is very important for API signatures, so those cases are where you really need to think about interface types. Private fields, not so much.

(One final note: there is a case where you need to be concerned about the types of private fields, and that's when you're using a reflective framework that manipulates private fields directly. In that case, you need to think of those fields as being public -- just like method return types -- even though they're not declared public.)

It really depends on what you want to do with the collection object.

Collection<String> cSet = new HashSet<>();
Collection<String> cList = new ArrayList<>();

In this case you can, if you want, write:

cSet = cList;

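But if you declare cSet as a Set instead:

Set<String> cSet = new HashSet<>();

then the assignment above no longer compiles, because a Collection<String> is not a Set<String>. You can still construct a new collection from an existing one using the copy constructor: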

Set<String> set = new HashSet<>();
List<String> list = new ArrayList<>();
list = new ArrayList<>(set);

So, depending on the usage, you can declare the variable as either Collection or Set.

Comments
  • Great question. A lot of it is a matter of taste and convention. I find List or Set cleaner and clearer in most cases, with the exception of method inputs, which should usually be as generic as possible.
  • The question was already answered in a broader sense, so here is a seemingly overly specific question about your particular example: What happens when somebody calls the init() method twice? Just think about how the choice of Set vs. List here interacts with the behavior of the registerListenerSomewhere method: Will the listener be stored there in a Set or in a List? If it's stored in a List there, but in a Set in your class, then calling reset will only remove one listener instance, but not all of them. It's difficult...
  • Excellent answer. The crux is: "The rule here is usually to choose the most abstract interface that supports the operations you expect callers to want to perform."