How to find DOS format files in a Linux file system

I would like to find out which of my files in a directory are dos text files (as opposed to unix text files).

What I've tried:

find . -name "*.php" | xargs grep ^M -l

It's not giving me reliable results... so I'm looking for a better alternative.

Any suggestions, ideas?

Thanks

Clarification

In addition to what I've said above, the problem is that I have a bunch of dos files with no ^M characters in them (hence my note about reliability).

The way I currently determine whether a file is dos or not is through Vim, where at the bottom it says:

"filename.php" [dos] [noeol]

Not sure what you mean exactly by "not reliable" but you may want to try:

find . -name '*.php' -print0 | xargs -0 grep -l '^M$'

This uses the -print0 and -0 options, which cope with atrocious filenames containing spaces, and only finds carriage returns immediately before the end of a line.

Keep in mind that the ^M is a single Ctrl-M character (typed by pressing Ctrl-V followed by Ctrl-M), not the two characters ^ and M.

And also that it'll list files where even one line is in DOS mode, which is probably what you want anyway since those would have been UNIX files mangled by a non-UNIX editor.
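
If typing a literal ^M is awkward, the carriage return can be generated by the shell instead. The following is a minimal sketch, assuming a POSIX shell; the function name list_crlf is made up for illustration. Like the command above, it lists any file with at least one CR-terminated line:

```shell
# list_crlf [DIR]: print *.php files under DIR (default: .) that contain
# at least one line ending in a carriage return.
# NOTE: list_crlf is a hypothetical helper name, not a standard command.
list_crlf() {
    cr=$(printf '\r')    # a literal CR, no Ctrl-V Ctrl-M needed
    # "$cr\$" is the pattern "CR at end of line"; -exec ... {} + hands
    # file names to grep safely, even when they contain spaces.
    find "${1:-.}" -type f -name '*.php' -exec grep -l "$cr\$" {} +
}
```

Calling list_crlf with no argument searches the current directory.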


Based on your update that vim is reporting your files as DOS format:

If vim is reporting it as DOS format, then every line ends with CRLF. That's the way vim works. If even one line doesn't have CR, then it's considered UNIX format and the ^M characters are visible in the buffer. If it's all DOS format, the ^M characters are not displayed:

Vim will look for both dos and unix line endings, but Vim has a built-in preference for the unix format.

 - If all lines in the file end with CRLF, the dos file format will be applied, meaning that each CRLF is removed when reading the lines into a buffer, and the buffer 'ff' option will be dos.
 - If one or more lines end with LF only, the unix file format will be applied, meaning that each LF is removed (but each CR will be present in the buffer, and will display as ^M), and the buffer 'ff' option will be unix.
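
The rule quoted above can be sketched as a small shell function. This is only an approximation of the documented behaviour, not Vim's actual detection code. The name classify is hypothetical, and treating a file with no complete lines as unix is an assumption:

```shell
# classify FILE: print "dos" if every LF-terminated line ends in CRLF,
# otherwise print "unix".
# NOTE: classify is a hypothetical helper; it sets the plain shell
# variables cr, seen and line.
classify() {
    cr=$(printf '\r')    # a literal carriage return
    seen=0
    # read fails on an unterminated final line, so only LF-terminated
    # lines are inspected; a missing final newline ([noeol]) is ignored.
    while IFS= read -r line; do
        seen=1
        case $line in
            *"$cr") ;;                  # this line ends in CRLF
            *) echo unix; return 0 ;;   # one LF-only line settles it
        esac
    done < "$1"
    # Assumption: a file with no complete lines is reported as unix.
    if [ "$seen" -eq 1 ]; then echo dos; else echo unix; fi
}
```

For example, classify filename.php prints dos only when no LF-only line is found, which is why such a file shows no ^M characters once loaded into Vim.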

If you really want to know what's in the file, don't rely on a too-smart tool like vim :-)

Use:

od -xcb input_file_name | less

and check the line endings yourself.
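
A quick way to summarise what od shows is to count the raw CR and LF bytes. This is a minimal sketch, assuming POSIX tr and wc; count_eol is a hypothetical helper name:

```shell
# count_eol FILE: print how many CR and LF bytes FILE contains.
# NOTE: count_eol is a hypothetical helper name.
count_eol() {
    # tr -cd deletes everything except the named byte; wc -c counts what
    # is left. The $(( )) wrapper strips padding some wc versions add.
    cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    echo "CR=$cr LF=$lf"
}
```

Equal counts suggest that every line ends in CRLF, while CR=0 means there are no carriage returns at all. A CR in the middle of a line would also be counted, so od remains the final word.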

How about:

find . -name "*.php" | xargs file | grep "CRLF"

I don't think searching for ^M is a reliable way to find the files.

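This is much like your original solution, so it may be easier for you to remember. In bash, $'\r' expands to a literal carriage return (a plain "\r" pattern is not portable across grep implementations):

find . -name "*.php" | xargs grep -l $'\r'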

Thought process:

In Vim, to remove the ^M you type:

 :%s/^M//g

where ^M is entered by pressing Ctrl-V and then Ctrl-M (or Enter). But I could never remember the keys to type to print that sequence, so I've always removed them using:

 :%s/\r//g

So my deduction is that the \r and ^M are equivalent, with the former being easier to remember to type.
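
Outside Vim, the same removal can be sketched with tr, which also avoids typing ^M. This is a minimal sketch assuming a POSIX shell; strip_cr is a hypothetical helper name, and like the substitution above it deletes every CR, not only those at line ends:

```shell
# strip_cr FILE: write FILE to standard output with all CR bytes removed.
# NOTE: strip_cr is a hypothetical helper name.
strip_cr() {
    tr -d '\r' < "$1"
}
```

Redirect the output to a different file, for example strip_cr dos.php > unix.php, because redirecting onto the input file would truncate it before it is read.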

I had good luck with:

find . -name "*.php" -exec grep -Pl "\r" {} \;

Comments
  • ... How exactly is it not reliable?
  • @ignacio What bvbp says. That is, I want to be able to find the property of the file rather than what the file contains
  • But it isn't a property of the file, it's what the file contains.
  • If it has no CRs then it's not a DOS format file.
  • @superspace, vim will detect a file as DOS if every line has CRLF otherwise it's UNIX. I'm not sure why you think those files aren't actually DOS format unless it's the missing ^M characters in vim's display, which is not a reliable indicator. See my answer update for the reason why, and the tool you should use to find out for certain.
  • Thanks for your response, ^M and ^M$ don't seem to return any more or fewer results for me
  • This is more like what I had in mind (that is, to find the property of the file rather than the content of the file). Unfortunately, a whole bunch of dos php files returned as "PHP script text" when passed through the file command instead of something about CRLF
  • for me this answer worked while the accepted answer did not work!
  • Thanks for your response, but unfortunately it does not add anything to what I already have... I do use the same method to remove ^M if I'm in vim and use fromdos when I'm outside
  • This is more or less like the accepted answer, except it uses exec instead of xargs. I found xargs to be significantly faster, in this case at least.
  • this is a similar method to @pvpb ... but it is still lacking... not returning the results I expected (because all the PHP files report that they are "PHP script text" files)