Removing specific values from a row when a related value comes before them

I have a .csv file that contains values that will distort my calculation.

Approach

I want to remove values that appear in a row after specific other values in that row. For example, if a row contains "(B)" and it comes before "(D)" and any other "(B)", only the first "(B)" should be kept.

The same applies to "+", "++" and "+++": I want to keep only the first one in each line.

Desired result

1277|2013-12-17 16:00:00|100|+|
1360|2014-01-15 16:00:00|(B)|99|++|E
1402|2014-02-05 20:00:00|(D)|99|++|D
1360|2014-01-29 08:00:00|(D)|99|C
1378|2014-01-21 20:00:00|(B)|100||D

Sample of the CSV file:

1277|2013-12-17 16:00:00|100|+|++|
1360|2014-01-15 16:00:00|(B)|(D)|99|++|+++||+|E
1402|2014-02-05 20:00:00|(D)|(B)|99|++|+||D
1360|2014-01-29 08:00:00|(D)|(B)|99||C
1378|2014-01-21 20:00:00|(B)|100||D

Here's a short program that takes a list of tuples, invalid_together, and removes values as you described in the question. It iterates through each row, and once it finds a value that belongs to one of the groups, it removes every later value from that same group. Empty fields (listed in removeAll) are removed as well.

import csv

invalid_together = [
    ('+', '++', '+++'),
    ('(A)', '(B)', '(C)', '(D)')
]

removeAll = ['']

with open('data.csv', 'rt', newline='') as dataIn:  # text mode; the csv module handles newlines itself
    with open('new_data.csv', 'wt', newline='') as dataOut:
        reader = csv.reader(dataIn, delimiter="|")
        writer = csv.writer(dataOut, delimiter="|")
        for row in reader:
            for invalidGroup in invalid_together:
                foundInvalid = False
                offset = 0
                for index in range(0, len(row)):
                    item = row[index - offset]
                    if item in invalidGroup and not(foundInvalid):
                        foundInvalid = True
                    elif (item in invalidGroup and foundInvalid) or (item in removeAll):
                        row.pop(index - offset)
                        offset += 1
            writer.writerow(row)
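
The same rule can also be sketched as a small function that builds a new list instead of popping from the row while iterating, which makes it easy to try on a single line. The names clean_row and remove_all are illustrative and not part of the program above; the groups are the ones from invalid_together.

```python
invalid_together = [
    ('+', '++', '+++'),
    ('(A)', '(B)', '(C)', '(D)'),
]
remove_all = ('',)  # values that are always dropped (here: empty fields)


def clean_row(row):
    """Return a copy of row keeping only the first member of each group."""
    seen = set()   # indexes of the groups already met in this row
    cleaned = []
    for item in row:
        if item in remove_all:
            continue
        # Index of the group this item belongs to, or None for ordinary values.
        group = next((i for i, g in enumerate(invalid_together) if item in g), None)
        if group is None:
            cleaned.append(item)
        elif group not in seen:
            seen.add(group)
            cleaned.append(item)
    return cleaned


print(clean_row('1360|2014-01-15 16:00:00|(B)|(D)|99|++|+++||+|E'.split('|')))
# ['1360', '2014-01-15 16:00:00', '(B)', '99', '++', 'E']
```

Because the row is never modified during iteration, no index offset has to be tracked.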

You can use the built-in csv module to read the CSV, filter each row so that it contains at most one element from each category, and write the result to a new CSV. First create a category filter:

import csv

categories = [  # make a list of tuples containing elements that should appear only once
    ("(B)", "(D)"),
    ("+", "++", "+++")
]

categories_map = {e: c[0] for c in categories for e in c}  # turn it into a quick lookup map

def filter_elements(row):  # and then build your filters
    unique = set()  # a set to hold our unique values
    for column in row:
        if column in categories_map:
            if categories_map[column] not in unique:
                unique.add(categories_map[column])
                yield column
        elif column:  # use `else:` instead if you want to keep the empty fields
            yield column

Finally, open your input CSV, read it, filter its rows and immediately write it to the output CSV:

with open("in.csv", "r", newline="") as f_in, open("out.csv", "w", newline="") as f_out:
    writer = csv.writer(f_out, delimiter="|")  # create a CSV writer
    for row in csv.reader(f_in, delimiter="|"):  # iterate over a CSV reader
        writer.writerow(filter_elements(row))  # filter + write to the out.csv

For your posted example data, this will produce out.csv containing:

1277|2013-12-17 16:00:00|100|+
1360|2014-01-15 16:00:00|(B)|99|++|E
1402|2014-02-05 20:00:00|(D)|99|++|D
1360|2014-01-29 08:00:00|(D)|99|C
1378|2014-01-21 20:00:00|(B)|100|D
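
To check this without touching any files, the same filter can be run against the sample from the question held in memory. This is only a sketch: the run helper, the keep_empty flag (standing in for the `else:` variant mentioned in the comment above) and the use of io.StringIO are assumptions added for illustration.

```python
import csv
import io

categories = [("(B)", "(D)"), ("+", "++", "+++")]
categories_map = {e: c[0] for c in categories for e in c}


def filter_elements(row, keep_empty=False):
    seen = set()  # categories already emitted for this row
    for column in row:
        if column in categories_map:
            if categories_map[column] not in seen:
                seen.add(categories_map[column])
                yield column
        elif column or keep_empty:  # empty fields survive only if requested
            yield column


sample = """\
1277|2013-12-17 16:00:00|100|+|++|
1360|2014-01-15 16:00:00|(B)|(D)|99|++|+++||+|E
1402|2014-02-05 20:00:00|(D)|(B)|99|++|+||D
1360|2014-01-29 08:00:00|(D)|(B)|99||C
1378|2014-01-21 20:00:00|(B)|100||D
"""


def run(text, keep_empty=False):
    """Filter pipe-delimited text and return the result as a string."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    for row in csv.reader(io.StringIO(text), delimiter="|"):
        writer.writerow(filter_elements(row, keep_empty))
    return out.getvalue()


print(run(sample))
```

With keep_empty=True the last sample row keeps its empty field and comes out as 1378|2014-01-21 20:00:00|(B)|100||D.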

You can use a regular expression to extract the desired parts:

import re

pattern = re.compile(r'^(.* \d+:\d+:\d+(?=\|))\|(\(\S\)(?=|))?.*?(\d+)\|(\++)?.*?\|(\S)?$')

with open('data.csv', 'r') as infile:
    with open('result.csv', 'w')  as outfile:
        for line in infile:
            outfile.write('|'.join(str(x) for x in pattern.match(line).groups() if x) + '\n')

This will result in:

1277|2013-12-17 16:00:00|100|+
1360|2014-01-15 16:00:00|(B)|99|++|E
1402|2014-02-05 20:00:00|(D)|99|++|D
1360|2014-01-29 08:00:00|(D)|99|C
1378|2014-01-21 20:00:00|(B)|100|D

If you want to post-process the output, it is probably better to keep a constant number of elements per line instead of skipping the elements that were empty. To do that, replace the last line with:

outfile.write('|'.join(str(x) for x in pattern.match(line).groups()) + '\n')

which produces:

1277|2013-12-17 16:00:00|None|100|+|None
1360|2014-01-15 16:00:00|(B)|99|++|E
1402|2014-02-05 20:00:00|(D)|99|++|D
1360|2014-01-29 08:00:00|(D)|99|None|C
1378|2014-01-21 20:00:00|(B)|100|None|D
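
If the literal text None is not wanted in the file, one option is to write an empty string for each missing group, which still keeps the column count constant. A minimal sketch, assuming the first pattern from this answer; the helper name fixed_width is hypothetical, and the check for a failed match is an addition, since pattern.match returns None for lines the pattern does not cover.

```python
import re

pattern = re.compile(r'^(.* \d+:\d+:\d+(?=\|))\|(\(\S\)(?=|))?.*?(\d+)\|(\++)?.*?\|(\S)?$')


def fixed_width(line):
    """Return the five captured fields joined by '|', with '' for missing ones."""
    match = pattern.match(line)
    if match is None:
        return None  # the line does not fit the pattern; the caller decides what to do
    return '|'.join(group or '' for group in match.groups())


print(fixed_width('1277|2013-12-17 16:00:00|100|+|++|'))
# 1277|2013-12-17 16:00:00||100|+|
print(fixed_width('1360|2014-01-29 08:00:00|(D)|(B)|99||C'))
# 1360|2014-01-29 08:00:00|(D)|99||C
```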

Edit:

In order to catch lines like:

325|2014-01-18 20:00:00|(B)|93|++|+||Calme 

the pattern can be modified to:

pattern = re.compile(r'^(.* \d+:\d+:\d+(?=\|))\|(\(\S\)(?=|))?.*?(\d+)\|(\++)?.*\|(\S+)?\s*?$')

You can quickly verify it against the row above, and against the rest of the lines on which the original pattern failed.
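
One way to run such a check locally is a short script that applies the modified pattern to a few rows. This is a sketch: the extract helper and the choice of rows (two from the question's sample plus the row reported in the comments) are assumptions for illustration.

```python
import re

pattern = re.compile(r'^(.* \d+:\d+:\d+(?=\|))\|(\(\S\)(?=|))?.*?(\d+)\|(\++)?.*\|(\S+)?\s*?$')

rows = [
    '1277|2013-12-17 16:00:00|100|+|++|',
    '1360|2014-01-15 16:00:00|(B)|(D)|99|++|+++||+|E',
    '325|2014-01-18 20:00:00|(B)|93|++|+||Calme ',
]


def extract(line):
    """Join the non-empty captured groups, or return None if nothing matched."""
    match = pattern.match(line)
    if match is None:
        return None
    return '|'.join(group for group in match.groups() if group)


for row in rows:
    print(extract(row))
# 1277|2013-12-17 16:00:00|100|+
# 1360|2014-01-15 16:00:00|(B)|99|++|E
# 325|2014-01-18 20:00:00|(B)|93|++|Calme
```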

Comments
  • Your output looks inconsistent - on the second through fourth lines you remove the empty elements, but you don't remove them on the last line. What should be the rule about that?
  • The rule is to delete the empty elements, because they are useless.
  • On stackoverflow, you're supposed to show us what you've tried to do first.
  • @MaximeChéramy I corrected the post as you suggested
  • I tried the program and it returned an error: line 92, in <module> for row in reader: _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
  • The encoding of your file must be different from mine. Try opening the CSVs with 'rt' and 'wt' to not use bytes
  • It worked well for the letters but not for the "+" values
  • Sorry, but I'd need more information to help. I copied your example into a CSV and ran this fine in Python 2.7, so I need to know a bit more
  • The program works, but the values "+", "++" and "+++" are still followed by one of the values "+", "++" or "+++", like this: 3030|2015-09-22 12:00:00|++|+++| and I'm using Python 3.6
  • It worked for the letters, but it kept the following '+', '++' or '+++' in 28 rows out of 650000
  • Can you provide one (or more) of the rows on which it failed?
  • Yes, of course. Here is an example of a row that failed: 325|2014-01-18 20:00:00|(B)|93|++|+||Calme
  • Check the edit. This pattern will catch all provided examples.