Most efficient way to eliminate repeating entries


I want to store information in an ArrayList. I am reading the data from a CSV file, but it contains duplicate records that I want to eliminate. What is the most efficient way to do that? I considered two approaches: (1) add all the data to a Set and then convert it to an ArrayList, or (2) add each item to the ArrayList only after checking that the list does not already contain it. Here is my code:

public static void sanitization(String file_path) throws FileNotFoundException, IOException {

    File file = new File(file_path);
    BufferedReader reader = new BufferedReader(new FileReader(file)); //read the csv file

    Set<Flight> flights_set = new HashSet<>(); //All valid flights will be added to set in order to prevent from adding same flights.

    String[] split = new String[31];
    String st;

    while ((st = reader.readLine()) != null) {
        split = st.split(",", -2);
        flights_set.add(new Flight(split[4], split[5], Integer.valueOf(split[11]), split[7], split[8], Integer.valueOf(split[0]), Integer.valueOf(split[1]), Integer.valueOf(split[2])));
    }

    //Second possible way
    /*while ((st = reader.readLine()) != null) {
        split = st.split(",", -2);
        Flight f=new Flight(split[4], split[5], Integer.valueOf(split[11]), split[7], split[8], Integer.valueOf(split[0]), Integer.valueOf(split[1]), Integer.valueOf(split[2]));

        if(!flights_arraylist.contains(f))
            flights_arraylist.add(f);
    }*/

    ArrayList<Flight> flights_arraylist = new ArrayList<>(flights_set);

}

class Flight implements Comparable<Flight> {

//All necessary information
public String airline;
public String flight_number;
public Integer departure_delay;
public String origin_airport_name;
public String destination_airport_name;
public Integer year;
public Integer month;
public Integer day;

//Constructor
public Flight(String airline, String flight_number, Integer departure_delay, String origin_airport_name, String destination_airport_name, Integer year, Integer month, Integer day) {
    this.airline = airline;
    this.flight_number = flight_number;
    this.departure_delay = departure_delay;
    this.origin_airport_name = origin_airport_name;
    this.destination_airport_name = destination_airport_name;
    this.year = year;
    this.month = month;
    this.day = day;
}

public Flight() {

}

//Flight is bigger if its departure delay is bigger
public int compareTo(Flight o) {
    if (this.departure_delay > o.departure_delay) return 1;
    else if (this.departure_delay < o.departure_delay) return -1;
    else return 0;
}

@Override
public boolean equals(Object obj) {
    Flight f = (Flight) obj;

    if ((this.airline.equals(f.airline)) && (this.flight_number.equals(f.flight_number)) && (this.departure_delay.equals(f.departure_delay)) && (this.origin_airport_name.equals(f.origin_airport_name)) && (this.destination_airport_name.equals(f.destination_airport_name)) && (this.year.equals(f.year)) && (this.month.equals(f.month)) && (this.day.equals(f.day))) {
        return true;
    }
    return false;

}

@Override
public int hashCode() {
    return 0;
}

@Override
public String toString() {
    return this.airline + " " + this.flight_number + " " + this.departure_delay;
}

}

This is also my first question, so please let me know if I made any mistakes.

You can use streams. Below is a sample of how to do it for lists.

First add all the elements to a list, then stream the list, collect the distinct elements, and assign the result back to the same variable.

Example:

List<String> strList = new ArrayList<String>();
strList.add("Alpha");
strList.add("Beta");
strList.add("Charlie");
strList.add("Delta");
strList.add("Delta");
strList.add("Delta");

strList = strList.stream().distinct().collect(Collectors.toList());
System.out.println("Without duplicate");
strList.forEach(System.out::println);

output:

Without duplicate
Alpha
Beta
Charlie
Delta


From the Javadoc of java.util.Set#add: "@return true if this set did not already contain the specified element". Additionally, note that BufferedReader provides a lines() method, which returns a Stream of the lines in the file. Knowing this, you could write something like this:

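    List<Flight> result = new ArrayList<>();   // list of your choice
    Set<Flight> flightSet = new HashSet<>();   // set of your choice
    BufferedReader reader = new BufferedReader(new FileReader(file_path));
    reader.lines()
            .forEach(line -> {
                // transform the line into an object; parseFlight is a hypothetical helper
                Flight flight = parseFlight(line);
                // add() returns true only if the set did not already contain the flight
                if (flightSet.add(flight)) {
                    result.add(flight);
                }
            });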

Or, using streams throughout, map each line to a Flight and collect the distinct results:

BufferedReader reader; // init bufferedReader
List<Flight> result = reader.lines()
            .map(line -> new Flight(/*... args*/))
            .distinct()
            .collect(Collectors.toList());
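
For reference, here is a minimal self-contained sketch of the fully streamed variant. It makes several assumptions that are not in the question: `Row` is a hypothetical two-field stand-in for `Flight`, the input is an in-memory string rather than a file, and the column positions are illustrative only. Note that `distinct()` relies on `equals()` and `hashCode()`, so both are overridden here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class DistinctLines {

    // Hypothetical, simplified stand-in for the Flight class (two fields only).
    static final class Row {
        final String airline;
        final int delay;

        Row(String airline, int delay) {
            this.airline = airline;
            this.delay = delay;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Row)) return false;
            Row other = (Row) obj;
            return delay == other.delay && airline.equals(other.airline);
        }

        @Override
        public int hashCode() {
            return Objects.hash(airline, delay);
        }

        @Override
        public String toString() {
            return airline + " " + delay;
        }
    }

    // Parses each "airline,delay" line and keeps the first occurrence of each row.
    static List<Row> readDistinct(BufferedReader reader) {
        return reader.lines()
                .map(line -> line.split(",", -1))
                .map(cols -> new Row(cols[0], Integer.parseInt(cols[1])))
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        String csv = "AA,5\nDL,12\nAA,5\nUA,0\nDL,12\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            readDistinct(reader).forEach(System.out::println);
        }
    }
}
```

With the sample input above, the three unique rows are printed in the order they first appear. For a real file, the `StringReader` would be replaced by a `FileReader`.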


To avoid duplicates, you ultimately need to search the data you have already stored.

On average, HashSet.contains() runs in O(1) time, provided hashCode() spreads the elements across buckets.

Internally, however, ArrayList.contains() uses the indexOf(object) method to check whether the object is in the list. indexOf(object) iterates over the array and compares each element using equals(object).

In terms of complexity, ArrayList.contains() therefore requires O(n) time, so checking every element before adding it costs O(n^2) overall.

It is most efficient to store the data in a Set to eliminate duplicates and then convert it to a List.
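
The two options from the question can be sketched side by side as follows. This is only an illustration: the class and method names are made up, and `LinkedHashSet` is used instead of `HashSet` on the assumption that first-seen order should be kept; a plain `HashSet` deduplicates the same way without any ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DedupCompare {

    // Option 1: hash-based set; each insertion is O(1) on average.
    static <T> List<T> viaSet(List<T> input) {
        Set<T> seen = new LinkedHashSet<>(input); // keeps first-seen order
        return new ArrayList<>(seen);
    }

    // Option 2: list scan; each contains() is O(n), so the loop is O(n^2) overall.
    static <T> List<T> viaListContains(List<T> input) {
        List<T> result = new ArrayList<>();
        for (T item : input) {
            if (!result.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("Alpha", "Beta", "Alpha", "Delta", "Beta");
        System.out.println(viaSet(input));          // [Alpha, Beta, Delta]
        System.out.println(viaListContains(input)); // [Alpha, Beta, Delta]
    }
}
```

Both methods return the same list; the difference is only in how the cost grows with the number of elements.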


Comments
  • You can use a Set<String>. Is there any need to use an ArrayList?
  • It would be good form to have your Flight objects compute proper hashcodes. Your Java IDE might be able to generate a suitable hashCode() method for you, or you can use Objects.hash to ease the task of writing one yourself. HashSets and HashMaps can potentially suffer from degraded performance when all the objects in them have the same hashcode.
  • It's a very good approach to just use a Set instead of an ArrayList. The only thing to consider is whether insertion-order is required for the collection members.
  • @MdFaraz I will sort them later.
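
To illustrate the comment about hash codes: a hashCode() that always returns 0 puts every object in the same bucket, so the set can no longer offer its average O(1) lookups. Below is a minimal sketch of a consistent equals()/hashCode() pair built with Objects.hash. `FlightKey` is a hypothetical cut-down version of the Flight class with only three of its fields; the real class would list all eight.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class FlightKey {
    private final String airline;
    private final String flightNumber;
    private final Integer departureDelay;

    public FlightKey(String airline, String flightNumber, Integer departureDelay) {
        this.airline = airline;
        this.flightNumber = flightNumber;
        this.departureDelay = departureDelay;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof FlightKey)) return false; // also rejects null
        FlightKey other = (FlightKey) obj;
        return Objects.equals(airline, other.airline)
                && Objects.equals(flightNumber, other.flightNumber)
                && Objects.equals(departureDelay, other.departureDelay);
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields compared in equals().
        return Objects.hash(airline, flightNumber, departureDelay);
    }

    public static void main(String[] args) {
        Set<FlightKey> flights = new HashSet<>();
        flights.add(new FlightKey("AA", "98", 5));
        flights.add(new FlightKey("AA", "98", 5)); // duplicate, ignored by the set
        flights.add(new FlightKey("DL", "17", null));
        System.out.println(flights.size()); // prints 2
    }
}
```

Using instanceof before the cast also avoids the ClassCastException that an unchecked cast would throw when equals() is called with an object of another type.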