How to find the average time from a log file for lines matching a specific pattern

I have logs in the following format. I need a shell script that greps for the pattern "Forged new block id" and calculates the average time from the timestamp column of the matching lines.

[inf] 2018-06-01 07:32:20 | Forged new block id: 17422268043238265953 height: 6 round: 1 slot: 6372914 reward: 0
[inf] 2018-06-01 07:32:30 | Forged new block id: 12637471709620273874 height: 7 round: 1 slot: 6372915 reward: 0
[inf] 2018-06-01 07:33:31 | Forged new block id: 9854455515089974346 height: 13 round: 1 slot: 6372921 reward: 0
[inf] 2018-06-01 07:35:34 | Forged new block id: 9528299565814967922 height: 25 round: 1 slot: 6372933 reward: 0
[inf] 2018-06-01 07:37:44 | Forged new block id: 4030154355419311374 height: 38 round: 1 slot: 6372946 reward: 0
[inf] 2018-06-01 07:38:34 | Forged new block id: 15961811681976216620 height: 43 round: 1 slot: 6372951 reward: 0
[inf] 2018-06-01 07:39:03 | Forged new block id: 18327854550540255433 height: 46 round: 1 slot: 6372954 reward: 0
[inf] 2018-06-01 07:43:05 | Forged new block id: 6436183970195006511 height: 70 round: 1 slot: 6372978 reward: 0
[inf] 2018-06-01 07:44:34 | Forged new block id: 2865139280099855691 height: 79 round: 1 slot: 6372987 reward: 0
[inf] 2018-06-01 07:45:45 | Forged new block id: 5462796790425133759 height: 86 round: 1 slot: 6372994 reward: 0

The expected result is based on the time difference between each row and the row before it. For example, the first row is 2018-06-01 07:32:20 and the second row is 2018-06-01 07:32:30, so the time difference is 10 seconds. The average is the sum of all the differences divided by the total number of rows.

You could use this GNU awk script:

$ awk '/Forged new block id/{timestr=$2" "$3;gsub(/[-:]/," ",timestr);t1=mktime(timestr);if(n++) {diff=t1-t2;print diff;total+=diff} t2=t1} END{if(n) print "average=" total/n}' file
10
61
123
130
50
29
242
89
71
average=80.5

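The script converts the 2nd and 3rd fields into a date string that gawk's mktime function understands.

Each time difference is calculated against the timestamp of the previous matching line.

The END block prints the average in seconds, dividing by the number of matching lines rather than by NR so that unrelated lines in the file do not skew the result.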
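
As the comments below point out, mktime is a GNU awk extension. The following is a minimal sketch of the same calculation that should run on any POSIX awk, converting the date to a day count by hand instead. The names avg_gap and days are hypothetical, and the timestamps are treated as UTC (an assumption), so daylight-saving shifts are ignored.

```shell
#!/bin/sh
# Sketch only: avg_gap and days are hypothetical names chosen for this example.
# Timestamps are treated as UTC (assumption), so DST shifts are ignored.
avg_gap() {
    awk '
    # Days from 1970-01-01 to the given Gregorian date.
    function days(y, m, d) {
        y += 0; m += 0; d += 0
        if (m <= 2) { y--; m += 12 }   # count Jan/Feb as months 13/14 of the previous year
        return 365 * y + int(y / 4) - int(y / 100) + int(y / 400) + int((153 * (m - 3) + 2) / 5) + d - 719469
    }
    /Forged new block id/ {
        split($2, ymd, "-")
        split($3, hms, ":")
        t = days(ymd[1], ymd[2], ymd[3]) * 86400 + hms[1] * 3600 + hms[2] * 60 + hms[3]
        if (n++) total += t - prev     # skip the first match: it has no predecessor
        prev = t
    }
    END { if (n) print "average=" total / n }
    ' "$1"
}

# Demo on the ten sample lines from the question.
log=$(mktemp)
cat > "$log" <<'EOF'
[inf] 2018-06-01 07:32:20 | Forged new block id: 17422268043238265953 height: 6 round: 1 slot: 6372914 reward: 0
[inf] 2018-06-01 07:32:30 | Forged new block id: 12637471709620273874 height: 7 round: 1 slot: 6372915 reward: 0
[inf] 2018-06-01 07:33:31 | Forged new block id: 9854455515089974346 height: 13 round: 1 slot: 6372921 reward: 0
[inf] 2018-06-01 07:35:34 | Forged new block id: 9528299565814967922 height: 25 round: 1 slot: 6372933 reward: 0
[inf] 2018-06-01 07:37:44 | Forged new block id: 4030154355419311374 height: 38 round: 1 slot: 6372946 reward: 0
[inf] 2018-06-01 07:38:34 | Forged new block id: 15961811681976216620 height: 43 round: 1 slot: 6372951 reward: 0
[inf] 2018-06-01 07:39:03 | Forged new block id: 18327854550540255433 height: 46 round: 1 slot: 6372954 reward: 0
[inf] 2018-06-01 07:43:05 | Forged new block id: 6436183970195006511 height: 70 round: 1 slot: 6372978 reward: 0
[inf] 2018-06-01 07:44:34 | Forged new block id: 2865139280099855691 height: 79 round: 1 slot: 6372987 reward: 0
[inf] 2018-06-01 07:45:45 | Forged new block id: 5462796790425133759 height: 86 round: 1 slot: 6372994 reward: 0
EOF
avg_gap "$log"    # prints average=80.5
rm -f "$log"
```

The sketch divides by the number of matching rows, as the question asks. If the mean gap between consecutive blocks is wanted instead, dividing by n - 1 would be the usual choice.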

You may use this awk script to calculate the average time (it shells out to date -d, so it needs GNU date):

awk 'function getSec(s) {
   cmd = "date +%s -d \"" s "\""
   cmd | getline d
   close( cmd )
   return d
}
/Forged new block id/ {
   ++n
   ts = $2 OFS $3
   if (n == 1)
      start = ts
}
END {
   print (getSec(ts) - getSec(start)) / n
}' file

80.5
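
Only the first and last timestamps are needed here because the consecutive differences telescope: (t2 - t1) + (t3 - t2) + ... + (tn - t(n-1)) collapses to tn - t1. A quick arithmetic check against the sample data, with the variable names chosen only for this illustration:

```shell
#!/bin/sh
# Sum of the nine gaps between the ten sample lines.
sum=$((10 + 61 + 123 + 130 + 50 + 29 + 242 + 89 + 71))

# First and last sample timestamps (both on 2018-06-01) as seconds since midnight.
first=$((7 * 3600 + 32 * 60 + 20))   # 07:32:20
last=$((7 * 3600 + 45 * 60 + 45))    # 07:45:45
span=$((last - first))

echo "$sum $span"                            # prints 805 805
awk -v s="$span" 'BEGIN { print s / 10 }'    # prints 80.5
```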

Just convert the times into epoch timestamps and do the math on them.

E.g. turn the human-readable form into epoch seconds ("Apr 28 07:50:01" becomes 1524916201), then subtract, add, or compare them as needed.

In this example I have a function that converts from human-readable form to a timestamp and back. I found this useful when I wrote a script to tell me when cron jobs started and when they actually completed.

Example: Convert from human to epoch function

#!/bin/bash
# -- Converts from human to epoch or epoch to human, 
#    specifically this format: "Apr 28 07:50:01" human.
#    typeset now=$(date +"%s")
#    typeset now_human_date=$(convert_cron_time "human" "$now")

function convert_cron_time() {
    # ${1,,} lowercases the first argument (requires bash 4 or later)
    case "${1,,}" in
        epoch)
            # human to epoch (eg. "Apr 28 07:50:01" to 1524916201)
            date -d "${2}" +"%s"
            ;;
        human)
            # epoch to human (eg. 1524916201 to "Apr 28 07:50:01")
            date -d "@${2}" +"%b %d %H:%M:%S"
            ;;
    esac
}


now=$(date +"%s")
time_range=604800   #one week in seconds
start=$now
finish=$((now + time_range))


start_human=$(convert_cron_time "human" "$start")
finish_human=$(convert_cron_time "human" "$finish")

start_epoch=$(convert_cron_time "epoch" "$start_human")
finish_epoch=$(convert_cron_time "epoch" "$finish_human")

echo "$start_human / $start_epoch"
echo "$finish_human / $finish_epoch"

another=$(convert_cron_time "epoch" "2018-06-01 07:32:20")
another_human=$(convert_cron_time "human" "$another")
echo "another ex: $another / $another_human"

Output:

Jun 01 08:47:34 / 1527857254
Jun 08 08:47:34 / 1528462054
another ex: 1527852740 / Jun 01 07:32:20

Hence, in a loop you can add up the fields or do comparisons as you read the file. In your case you would subtract each timestamp from the next one, take the difference in seconds, and do something with it (e.g. convert to minutes or hours).
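
A minimal sketch of such a loop, assuming GNU date (the -d option) and the log layout from the question. The function name average_gap is hypothetical, and timestamps are read as UTC so the result does not depend on the local time zone.

```shell
#!/bin/sh
# Sketch only: average_gap is a hypothetical helper; it assumes GNU date.
# The body runs in a subshell, so its variables do not leak to the caller.
average_gap() (
    prev="" total=0 count=0
    # Fields: [inf]  2018-06-01  07:32:20  rest-of-line
    while read -r tag day clock rest; do
        case $rest in
            *"Forged new block id"*) ;;
            *) continue ;;
        esac
        now=$(date -u -d "$day $clock" +%s) || exit 1
        # Accumulate the gap to the previous matching line, if there is one.
        [ -n "$prev" ] && total=$((total + now - prev))
        prev=$now
        count=$((count + 1))
    done < "$1"
    [ "$count" -gt 0 ] || exit 1
    # Shell arithmetic is integer-only, so let awk do the division.
    awk -v t="$total" -v n="$count" 'BEGIN { print t / n }'
)

# Demo on the first five sample lines from the question.
log=$(mktemp)
cat > "$log" <<'EOF'
[inf] 2018-06-01 07:32:20 | Forged new block id: 17422268043238265953 height: 6 round: 1 slot: 6372914 reward: 0
[inf] 2018-06-01 07:32:30 | Forged new block id: 12637471709620273874 height: 7 round: 1 slot: 6372915 reward: 0
[inf] 2018-06-01 07:33:31 | Forged new block id: 9854455515089974346 height: 13 round: 1 slot: 6372921 reward: 0
[inf] 2018-06-01 07:35:34 | Forged new block id: 9528299565814967922 height: 25 round: 1 slot: 6372933 reward: 0
[inf] 2018-06-01 07:37:44 | Forged new block id: 4030154355419311374 height: 38 round: 1 slot: 6372946 reward: 0
EOF
average_gap "$log"    # prints 64.8
rm -f "$log"
```

Calling date once per line is slow on large files, so for big logs an awk-only approach is likely the better fit.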

Comments
  • what should be the expected result, a timestamp or datetime string?
  • Is a shell script mandatory? Shell is not the best tool for file parsing; use a scripting language like Python or Perl. I would go with Python as a personal preference, especially because it is available by default on every *nix system, so no pain.
  • A Python or JavaScript script would also be fine.
  • What should the 3rd record be compared with: the 2nd record's datetime, or the difference between the 2nd and 1st records?
  • Don't just describe the expected output, show us the expected output given the sample input you provided.
  • That's GNU awk only.
  • @Manu You indeed need GNU awk (that might not be the default on your mac). brew install gawk might help.
  • I tried installing gawk, but now I am getting the error "calling undefined function mktime". Note: I am using a Mac.
  • @Manu it is likely that you still use the default awk. Use gawk instead.
  • I think it throws an error. Running the awk command from this answer on test/test.txt prints: date: illegal time format, followed by usage: date [-jnRu] [-d dst] [-r seconds] [-t west] [-v[+|-]val[ymwdHMS]]
  • Here is an online working demo of above code
  • The code relies on a date command that supports %s output format and allows you to provide a date to be converted but that's all non-POSIX so YMMV.