Get all lines containing a string in a huge text file - as fast as possible?


In PowerShell, how can I read a huge text file (about 200,000 lines / 30 MB) and get, as fast as possible, the last line (or all the lines) containing a specific string? I'm using:

get-content myfile.txt | select-string -pattern "my_string" -encoding ASCII | select -last 1

But it's very slow (about 16-18 seconds). I ran tests without the final `select -last 1` pipe, but it takes the same time.

Is there a faster way to get the last occurrence (or all occurrences) of a specific string in a huge file?

Perhaps that's just the time it needs... Or is there any possibility to read the file faster from the end, since I want the last occurrence? Thanks

Try this:

get-content myfile.txt -ReadCount 1000 |
 foreach { $_ -match "my_string" }

That will read your file in chunks of 1000 records at a time and find the matches in each chunk. This gives you better performance because you aren't wasting a lot of CPU time on memory management, since there are only 1000 lines at a time in the pipeline.
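If only the last occurrence is needed, the chunked read above can be combined with tracking just the most recent match, so nothing accumulates in memory. A sketch using the same hypothetical file and pattern:

```powershell
# Keep only the most recent matching line from each 1000-line chunk
$last = $null
Get-Content myfile.txt -ReadCount 1000 | ForEach-Object {
    $hits = @($_) -match "my_string"   # @() guards against a single-string chunk
    if ($hits) { $last = $hits[-1] }
}
$last
```

With an array on the left-hand side, `-match` acts as a filter and returns the matching elements, so `$hits[-1]` is the last match within the chunk.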


Have you tried:

gc myfile.txt | % { if($_ -match "my_string") {write-host $_}}

Or, you can create a "grep"-like function:

function grep($f, $s) {
    gc $f | % { if ($_ -match $s) { write-host $_ } }
}

Then you can just issue: grep myfile.txt "my_string"

Another option is to read the file line by line with a .NET StreamReader:

# Note: StreamReader resolves relative paths against the process working
# directory, not the PowerShell location, so a full path is safer here.
$reader = New-Object System.IO.StreamReader("myfile.txt")
$lines = @()

try {
    while (!$reader.EndOfStream) {
        $line = $reader.ReadLine()
        if ($line.Contains("my_string")) {
            $lines += $line
        }
    }
}
finally {
    $reader.Close()
}

$lines | Select-Object -Last 1
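For comparison, letting Select-String open the file itself (instead of piping Get-Content into it) avoids per-line pipeline overhead. A sketch worth timing on the same file:

```powershell
# Select-String reads the file directly; -Last 1 keeps only the final match
(Select-String -Path myfile.txt -Pattern "my_string" | Select-Object -Last 1).Line
```

Select-String returns MatchInfo objects, so `.Line` extracts the text of the matching line.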


Have you tried using [System.IO.File]::ReadAllLines()? This method is more "raw" than the PowerShell-esque approach, since it calls directly into the .NET Framework types.

# ReadAllLines requires a path; relative paths resolve against the
# process working directory, so a full path is safer.
$Lines = [System.IO.File]::ReadAllLines("myfile.txt")
$Lines | Where-Object { [Regex]::IsMatch($_, 'my_string_pattern') }





Comments
  • The reason there was no change whether you piped to "Select -last 1" or not is because the whole file has to be processed to know which is "last".
  • You may need to use .NET to have some performance there: Start reading massive text file from the end.
  • Is it possible to return nearby lines as well somehow?
  • Even with -ReadCount 1000, get-content still reads the entire file into memory. I ran out of memory trying to parse a 40 GB file. Any other ideas?
  • May be slow for big files, or even crash due to an out-of-memory exception.
  • This user specifically said he's using large files; why would you post a solution that can crash if used with large files?
  • If I want the regex to return the whole line where the pattern matches, how do I do that? [Regex]::Matches($line, 'Database:') tells me where Database: matches, but I need the database name from that line as well.
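Following up on the comment about reading from the end: a rough sketch, assuming the last occurrence falls within the final 1 MB of the file and the file is ASCII text (per the question's -encoding ASCII). It seeks to the tail and searches only that chunk instead of scanning the whole file:

```powershell
# Read only the last 1 MB and search it (assumes the pattern occurs
# in that tail; otherwise the chunk size must be increased)
$path = (Resolve-Path "myfile.txt").Path
$fs = [System.IO.File]::OpenRead($path)
try {
    $chunk = [int][Math]::Min(1MB, $fs.Length)
    $fs.Seek(-$chunk, [System.IO.SeekOrigin]::End) | Out-Null
    $buf = New-Object byte[] $chunk
    $fs.Read($buf, 0, $chunk) | Out-Null
}
finally {
    $fs.Close()
}
$text = [System.Text.Encoding]::ASCII.GetString($buf)
($text -split "`r?`n") -match "my_string" | Select-Object -Last 1
```

The first line of the chunk may be cut mid-line, so if the very first "line" matches, re-reading with a larger chunk is the safe fallback.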