I want to store information in an ArrayList. I am reading the data from a CSV file, but it contains duplicate records and I want to eliminate them. What is the most efficient way to do that? I considered two ways: (1) add all the data to a Set and convert it to an ArrayList, or (2) add each item to an ArrayList only after checking that the list does not already contain it. Here is my code:

public static void sanitization(String file_path) throws FileNotFoundException, IOException {

    File file = new File(file_path);
    BufferedReader reader = new BufferedReader(new FileReader(file)); //read the csv file

    Set<Flight> flights_set = new HashSet<>(); //All valid flights will be added to set in order to prevent from adding same flights.

    String[] split = new String[31];
    String st;

    while ((st = reader.readLine()) != null) {
        split = st.split(",", -2);
        flights_set.add(new Flight(split[4], split[5], Integer.valueOf(split[11]), split[7], split[8], Integer.valueOf(split[0]), Integer.valueOf(split[1]), Integer.valueOf(split[2])));
    }

    //Second possible way
    /*while ((st = reader.readLine()) != null) {
        split = st.split(",", -2);
        Flight f = new Flight(split[4], split[5], Integer.valueOf(split[11]), split[7], split[8], Integer.valueOf(split[0]), Integer.valueOf(split[1]), Integer.valueOf(split[2]));
        if (!flights_arraylist.contains(f)) flights_arraylist.add(f);
    }*/


    ArrayList<Flight> flights_arraylist = new ArrayList<>(flights_set);
}


class Flight implements Comparable<Flight> {

//All necessary information
public String airline;
public String flight_number;
public Integer departure_delay;
public String origin_airport_name;
public String destination_airport_name;
public Integer year;
public Integer month;
public Integer day;

public Flight(String airline, String flight_number, Integer departure_delay, String origin_airport_name, String destination_airport_name, Integer year, Integer month, Integer day) {
    this.airline = airline;
    this.flight_number = flight_number;
    this.departure_delay = departure_delay;
    this.origin_airport_name = origin_airport_name;
    this.destination_airport_name = destination_airport_name;
    this.year = year;
    this.month = month;
    this.day = day;
}

public Flight() {
}


//Flight is bigger if its departure delay is bigger
public int compareTo(Flight o) {
    if (this.departure_delay > o.departure_delay) return 1;
    else if (this.departure_delay < o.departure_delay) return -1;
    else return 0;
}

public boolean equals(Object obj) {
    Flight f = (Flight) obj;

    if ((this.airline.equals(f.airline)) && (this.flight_number.equals(f.flight_number)) && (this.departure_delay.equals(f.departure_delay)) && (this.origin_airport_name.equals(f.origin_airport_name)) && (this.destination_airport_name.equals(f.destination_airport_name)) && (this.year.equals(f.year)) && (this.month.equals(f.month)) && (this.day.equals(f.day))) {
        return true;
    }
    return false;
}


public int hashCode() {
    return 0;
}

public String toString() {
    return this.airline + " " + this.flight_number + " " + this.departure_delay;
}
}


This is also my first question, so please let me know if I made any mistakes.

You can use streams. Below is a sample way of doing it for lists.

First add all elements to a list, then stream the list, collect the distinct elements, and assign the result back to the same variable.


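List<String> strList = new ArrayList<String>();
// add the elements (possibly containing duplicates) to strList here

strList = strList.stream().distinct().collect(Collectors.toList());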
System.out.println("Without duplicate");


Without duplicate
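
As a minimal, self-contained sketch of the same idea: the class name `DistinctListDemo`, the helper `withoutDuplicates`, and the sample values are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctListDemo {

    // Returns a new list with duplicates removed; the first occurrence of each value is kept.
    static List<String> withoutDuplicates(List<String> input) {
        return input.stream()
                .distinct() // relies on equals() and hashCode() of the elements
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample values are hypothetical.
        List<String> strList = new ArrayList<>(Arrays.asList("AA", "DL", "AA", "UA", "DL"));
        strList = withoutDuplicates(strList);
        System.out.println("Without duplicate");
        System.out.println(strList); // prints [AA, DL, UA]
    }
}
```

Because `distinct()` compares elements with `equals()` and `hashCode()`, the same approach should work for a list of `Flight` objects as long as both methods are implemented consistently.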

From the Javadoc of java.util.Set#add: @return true if this set did not already contain the specified element. Additionally, note that BufferedReader provides a lines() method, which returns a Stream of the lines in the file as Strings. Knowing this, you could write something like this:

    List<Flight> result;//list of your choice;
    Set<Flight> flightSet; //set of your choice;
    BufferedReader reader; // init bufferedReader
    reader.lines()
            .forEach(line -> {
                Flight flight;//transform into object;
                if (flightSet.add(flight)) { // true only if the flight was not in the set yet
                    result.add(flight);
                }
            });

Or, using streams throughout, map each line to an object and collect the distinct ones:

BufferedReader reader; // init bufferedReader
List<Flight> result = reader.lines()
            .map(line->new Flight(/*... args*/))
            .distinct()
            .collect(Collectors.toList());
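
Both variants can be tried without a file by reading from a string. This is only a sketch under stated assumptions: `Leg` is a hypothetical two-field stand-in for `Flight`, and the class name `CsvDedupDemo`, the method names, and the CSV text are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class CsvDedupDemo {

    // Hypothetical two-field stand-in for the Flight class from the question.
    static final class Leg {
        final String airline;
        final String flightNumber;

        Leg(String airline, String flightNumber) {
            this.airline = airline;
            this.flightNumber = flightNumber;
        }

        // Builds a Leg from one CSV line of the form "airline,flightNumber".
        static Leg parse(String line) {
            String[] split = line.split(",", -1);
            return new Leg(split[0], split[1]);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Leg)) return false;
            Leg other = (Leg) obj;
            return airline.equals(other.airline) && flightNumber.equals(other.flightNumber);
        }

        @Override
        public int hashCode() {
            return Objects.hash(airline, flightNumber);
        }

        @Override
        public String toString() {
            return airline + " " + flightNumber;
        }
    }

    // Variant 1: Set.add returns false for a duplicate, so only new items reach the list.
    static List<Leg> readWithSet(BufferedReader reader) {
        List<Leg> result = new ArrayList<>();
        Set<Leg> seen = new HashSet<>();
        reader.lines().forEach(line -> {
            Leg leg = Leg.parse(line);
            if (seen.add(leg)) {
                result.add(leg);
            }
        });
        return result;
    }

    // Variant 2: let the stream drop the duplicates.
    static List<Leg> readWithDistinct(BufferedReader reader) {
        return reader.lines()
                .map(Leg::parse)
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        String csv = "AA,100\nDL,200\nAA,100\n"; // made-up sample data
        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            System.out.println(readWithSet(reader)); // prints [AA 100, DL 200]
        }
        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            System.out.println(readWithDistinct(reader)); // prints [AA 100, DL 200]
        }
    }
}
```

Both variants keep the first occurrence of each record in file order. The try-with-resources blocks also close the reader, which the code in the question does not do.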

To avoid duplicates, you eventually need to search the data you have already stored.

On average, the HashSet.contains() runs in O(1) time.

Internally, however, ArrayList.contains() uses the indexOf(object) method to check whether the object is in the list. indexOf(object) iterates over the backing array, comparing each element with equals(object), until it finds a match; if there is no match, it scans the entire array.

Getting back to the complexity analysis, ArrayList.contains() therefore requires O(n) time, so checking before each of n insertions costs O(n²) in total.

It is most efficient to use a Set to store the items without duplicates and then convert it to a List.
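
One rough way to see the difference is to count equals() calls instead of measuring time. The sketch below is illustrative only: `Item`, `ContainsCostDemo`, and the counter are hypothetical names, and the elements are all distinct so that the list approach hits its worst case.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ContainsCostDemo {

    // Number of equals() calls made so far; reset before each measurement.
    static long equalsCalls = 0;

    // Hypothetical element type that counts how often equals() is invoked.
    static final class Item {
        final int id;

        Item(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object obj) {
            equalsCalls++;
            return obj instanceof Item && ((Item) obj).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    // Second approach from the question: check the list before every insertion.
    static long equalsCallsWithList(int n) {
        equalsCalls = 0;
        List<Item> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Item item = new Item(i);
            if (!list.contains(item)) { // linear scan of everything added so far
                list.add(item);
            }
        }
        return equalsCalls;
    }

    // First approach from the question: collect into a set, then copy into a list.
    static long equalsCallsWithSet(int n) {
        equalsCalls = 0;
        Set<Item> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new Item(i)); // hash lookup; equals() is consulted only for entries with a matching hash
        }
        List<Item> list = new ArrayList<>(set); // final conversion, one pass over the set
        if (list.size() != n) throw new IllegalStateException("unexpected size");
        return equalsCalls;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("list: " + equalsCallsWithList(n)); // n * (n - 1) / 2 = 499500
        System.out.println("set:  " + equalsCallsWithSet(n));
    }
}
```

With n distinct elements the list version performs n(n-1)/2 comparisons, while the set version needs far fewer because each lookup goes straight to one bucket.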

  • You can use a Set<String>. Is there any need to use an ArrayList?
  • It would be good form to have your Flight objects compute proper hashcodes. Your Java IDE might be able to generate a suitable hashCode() method for you, or you can use Objects.hash to ease the task of writing one yourself. HashSets and HashMaps can potentially suffer from degraded performance when all the objects in them have the same hashcode.
  • It's a very good approach to just use a Set instead of an ArrayList. The only thing to consider is whether insertion-order is required for the collection members.
  • @MdFaraz I will sort them later.
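
The hashCode and insertion-order remarks in the comments above can be sketched as follows. This is an illustration, not the asker's class: `FlightKey` is a hypothetical, trimmed-down version of `Flight` with only three of its fields, and `dedupeKeepingOrder` is an invented helper. It uses Objects.hash for the hash code and a LinkedHashSet (a swapped-in alternative to the HashSet in the question) so that the first-seen order survives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;

class HashCodeDemo {

    // Hypothetical, trimmed-down version of Flight with three of its fields.
    static final class FlightKey {
        final String airline;
        final String flightNumber;
        final Integer departureDelay;

        FlightKey(String airline, String flightNumber, Integer departureDelay) {
            this.airline = airline;
            this.flightNumber = flightNumber;
            this.departureDelay = departureDelay;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof FlightKey)) return false; // also rejects null
            FlightKey other = (FlightKey) obj;
            return Objects.equals(airline, other.airline)
                    && Objects.equals(flightNumber, other.flightNumber)
                    && Objects.equals(departureDelay, other.departureDelay);
        }

        // Combines the same fields that equals() compares, so equal objects share a hash code.
        @Override
        public int hashCode() {
            return Objects.hash(airline, flightNumber, departureDelay);
        }

        @Override
        public String toString() {
            return airline + " " + flightNumber + " " + departureDelay;
        }
    }

    // Removes duplicates while keeping the order in which items were first seen.
    static List<FlightKey> dedupeKeepingOrder(List<FlightKey> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        // Sample values are made up.
        List<FlightKey> flights = Arrays.asList(
                new FlightKey("AA", "100", 5),
                new FlightKey("DL", "200", 0),
                new FlightKey("AA", "100", 5));
        System.out.println(dedupeKeepingOrder(flights)); // prints [AA 100 5, DL 200 0]
    }
}
```

If the list is sorted afterwards anyway, a plain HashSet would do just as well, since the insertion order is then irrelevant.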