Delete Duplicate Rows, Keep Last and Delete First
I'm trying to come up with code that look in Column D for any duplicate text then deletes the entire row that the first duplicate is located in. There are blanks in between the rows so using the code
.End(xl)Up doesn't work unless you're able to target the entire column regardless of the blanks in between
the rows somehow.
I've tried two methods so far but neither have worked to my expectation.
This was my first method which doesn't work since the worksheet has an outline:
Sub test() ActiveSheet.Range("D:D").RemoveDuplicates Columns:=1, header:=xlNo End Sub
This was my second method that I got from another site that runs for minutes but doesn't appear to do what I'm trying to achieve.
Sub Row_Dupe_Killer_Keep_Last() Dim lrow As Long For lrow = Cells(Rows.Count, "D").End(xlUp).Row To 2 Step -1 If Cells(lrow, "D") = Cells(lrow, "D").Offset(-1, 0) Then Cells(lrow, "D").Offset(-1, 0).EntireRow.Delete End If Next lrow End Sub
Does anyone have any suggestions or tips? I've been working at it for a few days now with my limited skills and haven't been able to figure a way to do it...Thank you in advance for your time.
Edit: Now ignores blanks
Edit: Modified to have the ability to change the starting row
What you might want to do is pull the data into an array and search the array for duplicates. Excel can process arrays much faster than it can go through each cell.
The following code does just that. It will leave D1 alone (such as in your example code) and will remove the entire row of any duplicates, only leaving the last item.
To deal with deleting the rows, we add all of the duplicates into a range object named rngDelete and delete all the rows at once. This will make it run much faster than deleting one by one.
Sub Row_Dupe_Killer_Keep_Last() Dim vData As Variant Dim rngDelete As Range Dim lrow As Long, lrowSearch As Long Dim lStartRow as long 'Change this to the row you wish to start with (the top row) lStartRow = 22 'Get all of the data from the cells into a variant array 'Normally I would prefer to use usedrange, but this method is fine '(Note: Change the 2 to 1 if you want to include the entire column including Row number 1) vData = Range(Cells(lStartRow, "D").Address & ":" & Cells(Rows.Count, "D").End(xlUp).Address) 'Search for duplicates 'First, loop through backwards one by one For lrow = UBound(vData) To LBound(vData) Step -1 'now loop through forwards (up to the point where we have already looked) For lrowSearch = LBound(vData) To lrow 'Check if we have a duplicate If Not IsError(vData(lrow, 1)) And Not IsError(vData(lrowSearch, 1)) Then If lrow <> lrowSearch And vData(lrow, 1) = vData(lrowSearch, 1) And vData(lrow, 1) <> "" Then 'We have a duplicate! Let's add it to our "list to delete" If rngDelete Is Nothing Then 'if rngDelete isn't set yet... Set rngDelete = Range("D" & lrowSearch + lStartRow-1) Else 'if we are adding to rngDelete... Set rngDelete = Union(rngDelete, Range("D" & lrowSearch + lStartRow-1)) End If End If End If Next lrowSearch Next lrow 'Delete all of the duplicate rows If Not rngDelete Is Nothing Then rngDelete.EntireRow.Delete End If End Sub
VBA, VBA - Delete Duplicate Rows Keep Last Entry j As Long Dim ROW_DELETED As Boolean i = 1 'start on first row Do While i <= ActiveSheet. The following code does just that. It will leave D1 alone (such as in your example code) and will remove the entire row of any duplicates, only leaving the last item. To deal with deleting the rows, we add all of the duplicates into a range object named rngDelete and delete all the rows at once.
This should help you out.
Option Explicit Const c_intMaxBlanks As Integer = 5 Const c_AbsoluteMaxRowsInSheet As Integer = 5000 Public Sub RunIt() Row_Dupe_Killer_Keep_Last ActiveSheet.Range("D:D") End Sub Public Sub Row_Dupe_Killer_Keep_Last(rngCells As Range) Dim iRow As Integer, iCol As Integer Dim intBlankCnt As Integer Dim intMaxBlanks As Integer Dim blnIsDone As Boolean Dim intSaveStartRow As Integer Dim blnStartCnt As Boolean Dim strTemp As String Dim strCheck As String Dim intI As Integer Dim intJ As Integer Dim intSaveEndRow As Integer 'First, Count the consecutive blanks blnIsDone = False blnStartCnt = False intSaveStartRow = 0 intSaveEndRow = 0 intBlankCnt = 0 iRow = 1 iCol = rngCells.Column Do While (Not blnIsDone) 'Check for blank Row using length of string If (Len(Trim(rngCells.Cells(iRow, 1).Value)) < 1) Then If Not blnStartCnt Then intSaveStartRow = iRow blnStartCnt = True Else If (intSaveStartRow + intBlankCnt) <> iRow Then 'restart intSaveStartRow = iRow intBlankCnt = 0 End If End If intBlankCnt = intBlankCnt + 1 Else 'restart blnStartCnt = False intBlankCnt = 0 End If intSaveEndRow = iRow If intBlankCnt >= c_intMaxBlanks Then blnIsDone = True 'Stop Loop: Maybe Infinite" If iRow > c_AbsoluteMaxRowsInSheet Then Exit Do iRow = iRow + 1 Loop 'Now, loop through each row in the column and check values. For intI = intSaveEndRow To 2 Step -1 strTemp = LCase(Trim(rngCells.Cells(intI, 1).Value)) For intJ = intSaveEndRow To 2 Step -1 If intJ <> intI Then strCheck = LCase(Trim(rngCells.Cells(intJ, 1).Value)) If strTemp = strCheck Then 'Found a dup, delete it rngCells.Cells(intJ, 1).EntireRow.Delete 'ElseIf Len(strCheck) < 1 Then ' 'Delete the blank line ' rngCells.Cells(intJ, 1).EntireRow.Delete End If End If Next intJ Next intI End Sub
Delete Duplicate Entries, Keep Last And Delete First, I found this sample code that works from top to bottom of a spreadsheet. But I need something that will delete the first entry and keep the last At the first instance, the solution seems to be simple. In Power Query, you would think that you simply: Sort the table by Order Date in descending order. Select the customer key column and then remove duplicates. But when you do this in Power Query, it does not work as expected.
This method avoids the use of
EntireRow.Delete, which is notoriously slow. The contents are cleared and the dataset is sorted to remove gaps.
EDIT: switched to For Next to enable searching upwards from the bottom; also cleaned up sort routine generated by the macro recorder...I never seem to have that routine on-hand when I need it :).
Note: this will also not work with an outline...whatever you did to make it work for the other answer will need to be done for this one as well.
I am curious if the Clear/Sort approach works for you and if it speeds up your routine.
Option Explicit Sub RemoveFirstDuplicate() Dim myDataRange As Range, iCounter As Long, myDuplicate As Range, lastRow As Long lastRow = Range("D1000000").End(xlUp).Row Set myDataRange = Sheets("Sheet1").Range("D1:D" & lastRow) 'searching up to the second row (below the field name assumed to be in row 1)...you may need to adjust where the loop stops For iCounter = myDataRange.Cells.Count To 2 Step -1 With myDataRange If WorksheetFunction.CountIf(myDataRange, myDataRange.Item(iCounter)) > 1 Then Set myDuplicate = .Find(What:=myDataRange.Item(iCounter), After:=myDataRange.Item(iCounter), SearchDirection:=xlPrevious) Range("D" & myDuplicate.Row).ClearContents End If End With Next iCounter With ActiveWorkbook.Worksheets("Sheet1").Sort .SortFields.Clear .SortFields.Add Key:=myDataRange.Offset(1, 0) .SetRange myDataRange .Header = xlYes .Apply End With End Sub
Solved: Remove duplicates - keep last vs keep first, Solved: Hi, I need to remove rows in a table that have duplicate values and I need to keep last one. Power BI by default keeps first one Select the data range that you want to remove duplicates and keep the most recent date from another column. 2. Then click Kutools > Merge & Split > Advanced Combine Rows, see screenshot: 3.
Remove duplicate rows, and keep newest row based on date , The trick is to sort your table before using Remove duplicates . Excel always keeps the first data set of a duplicated row. All consecutive rows 6 (delete this row) 6 (keep this row) 7 8 (delete this row) 8 (delete this row) 8 (delete this row) 8 (delete this row) 8 (keep this row) 9 10. I want to delete the rows that have duplicate values, but the kicker is I need to keep the last row with the duplicated value. So I want the second (or last if multiples) record. I tried this piece of
Mr Excel & excelisfun Trick 90: Remove Earlier Duplicates, Keep , Filter for unique values, remove duplicate values, and conditionally format Keep in touch and stay productive with Teams and Microsoft 365, even when you're Some rows have one duplicate, some have many. SELECT CONCAT(k1, k2, k) AS dup_value FROM myviews GROUP BY dup_value HAVING (COUNT(dup_value) > 1) shows me duplicates values that I need to deal with. But now I don't know how to keep one and delete the rest of each duplicate set.
Filter for unique values or remove duplicate values, In some cases, remove duplicate doesn't help you to keep the last of the range is the cell containing the email field on the first row of the SQL delete duplicate Rows using Common Table Expressions (CTE) We can use Common Table Expressions commonly known as CTE to remove duplicate rows in SQL Server. It is available starting from SQL Server 2005. We use a SQL ROW_NUMBER function, and it adds a unique sequential row number for the row.