How to extract a specific text from gz file?


I need to extract characters 5 to 11 from the sequence lines of my fastq.gz data. The data is too large to process in R, so I was wondering whether I can do it directly on the Linux command line. The FASTQ file looks like this:

@NB501399:67:HFKTCBGX5:1:11101:13202:1044 1:N:0:CTTGTA
GAGGTNACGGAGTGGGTGTGTGCAGGGCCTGGTGGGAATGGGGAGACCCGTGGACAGAGCTTGTTAGAGTGTCCTAGAGCCAGGGGGAACTCCAGGCAGGGCAAATTGGGCCCTGGATGTTGAGAAGCTGGGTAACAAGTACTGAGAGAAC
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAE6

@NB501399:67:HFKTCBGX5:1:11101:1109:1044 1:N:0:CTTGTA
TAGGCNACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATGCCGAACTTAGTGCGGACACCCGATCGGCATAGCGCACTACAGCCCAGAACTCCTGGACTCAAGCGATCCTCCAGCCTCAGCCTCCCGAGTAGCTGGGACTACAG
+

I only want to extract characters 5 to 11 of each sequence line (TNACGG for the first record, CNACCT for the second) and write them to a new text file. Can I do that?

Extracting specific lines from a large (compressed) text file: A friend asked me the following question: how can one efficiently extract specific lines from a large text file, possibly compressed with gzip? The text file is compressed with gzip, and we do not want to decompress the whole file to disk. The solution to the first point is still sed: for example, to extract lines 2, 4, and 6, the following command works: sed -n '2p;4p;6p' somefile.txt. Even better, we can generate this command and run it from within R, so that we do not need to type it manually when the list of lines is long.
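
The same sed idea can be applied to a gzip-compressed file by streaming the decompressed text through a pipe, so nothing is unpacked to disk. The following is a minimal sketch, not the snippet author's own command: the file name somefile.txt.gz and its seven-line content are made up here so the example is self-contained, and the `q` after line 6 is an added assumption that stopping early is desirable for a large file.

```shell
# Build a small demo file (hypothetical name and content) in a temp directory.
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\nline4\nline5\nline6\nline7\n' \
  | gzip -c > "$workdir/somefile.txt.gz"

# Decompress to stdout and print only lines 2, 4 and 6.
# After printing line 6, sed quits, so the rest of the stream is not processed.
gzip -dc "$workdir/somefile.txt.gz" \
  | sed -n '2p;4p;6{p;q;}' > "$workdir/picked.txt"

cat "$workdir/picked.txt"
```

Here `gzip -dc` plays the role of zcat; either should work on a typical Linux system.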


$ zcat fastq.gz | awk '(NR%5)==2{print substr($0,5,6)}'
TNACGG
CNACCT
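
The `NR%5` in the command above matches the sample as it is displayed in the question, where a blank line separates the records. A standard FASTQ file normally has exactly four lines per record and no blank lines, in which case the sequence lines are those where `NR % 4 == 2`. The sketch below illustrates that variant under this assumption. The demo file, the read names @read1 and @read2, and the shortened sequences are made up for illustration. Note that the expected results in the question are six characters long, which corresponds to positions 5 through 10, hence `substr($0, 5, 6)`.

```shell
# Build a tiny two-record FASTQ file (hypothetical names, shortened reads).
workdir=$(mktemp -d)
printf '%s\n' \
  '@read1' 'GAGGTNACGGAGTGGG' '+' 'AAAAA#EEEEEEEEEE' \
  '@read2' 'TAGGCNACCTGGTGGT' '+' 'AAAAA#EEEEEEEEEE' \
  | gzip -c > "$workdir/demo.fastq.gz"

# Line 2 of every 4-line record is the sequence; take 6 characters from position 5
# and write them to a text file instead of the terminal.
gzip -dc "$workdir/demo.fastq.gz" \
  | awk 'NR % 4 == 2 { print substr($0, 5, 6) }' > "$workdir/barcodes.txt"

cat "$workdir/barcodes.txt"
```

It may be worth checking the real file first, for example with `zcat fastq.gz | head -n 10`, to see which of the two layouts applies before choosing between `NR%5` and `NR%4`.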





Comments
  • Your latest edit seems invalid. The location of TAGAGG is certainly not in positions 5-11 in the first sequence, and TATAGG is not present in the sample data at all. You should not update to change the requirements drastically after receiving multiple answers, anyway.
  • @tripleee I am sorry, I misunderstood the fastq file. You are right, CTTGTA is not part of the sequence.
  • @tripleee It seems you deleted your answer. Could you show it to me again?
  • I never posted an answer. JamesBrown appears to have undeleted his, if that's the one you are looking for.
  • Stack Overflow is a site for programming and development questions.
  • I am sorry, I made a mistake: the sequence I want to extract is TAGAGG for the first one and TATAGG for the second.
  • It starts at the first line
  • I can't even find TAGAGG in the whole file.
  • Ah, it is on multiple rows. Bailing out until OP gets his question right.
  • Thank you, but your script prints the result directly. I would like the result to be saved to a file. Can I do that?
  • Redirect the output to a file: zcat fastq.gz | sed -n '2~5{s/.\{4\}\(.\{6\}\).*/\1/;p}' > outputfile.txt
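
The `2~5` address in the sed command from the last comment is a GNU sed extension (every 5th line starting at line 2), so it may not work with other sed implementations. A possible portable equivalent, sketched here with awk and cut rather than taken from the thread, is shown below. It assumes the five-line layout displayed in the question (four record lines plus a blank line), and the demo file with read names @read1 and @read2 is made up for illustration.

```shell
# Build a demo file with a blank line after each record (hypothetical content).
workdir=$(mktemp -d)
printf '%s\n' \
  '@read1' 'GAGGTNACGGAGTGGG' '+' 'AAAAA#EEEEEEEEEE' '' \
  '@read2' 'TAGGCNACCTGGTGGT' '+' 'AAAAA#EEEEEEEEEE' '' \
  | gzip -c > "$workdir/fastq.gz"

# awk keeps line 2 of every 5-line block; cut keeps characters 5 through 10;
# the shell redirection saves the result to a file.
gzip -dc "$workdir/fastq.gz" \
  | awk 'NR % 5 == 2' \
  | cut -c5-10 > "$workdir/outputfile.txt"

cat "$workdir/outputfile.txt"
```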