Java library or text file that maps mime types to nice human friendly file types


GOAL My goal is to find a text file or library that, given a mime type as input, returns a nice human friendly description of it.

For example given the mime type for Word (as shown below) I would like a result that is something like "Microsoft Office Word Document".

application/vnd.openxmlformats-officedocument.wordprocessingml.document

I realise I could compile my own list and use something like a Map (Java), but then it would not be comprehensive.

SIMPLISTIC OPTION I know I can take the sub mime type and keep only its last component, but that is not very sophisticated: for the Word mime type above, the result would be a very generic "document". I could keep more components, but the result is still quite ugly.
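For illustration, the simplistic option could look roughly like the sketch below. The class and method names (SimpleMimeLabel, lastComponent) are made up for this example, and stripping parameters such as "; charset=utf-8" is an extra assumption rather than something the approach requires.

```java
final class SimpleMimeLabel {

    /** Returns the last dot-separated component of the subtype. */
    static String lastComponent(String mimeType) {
        // Drop any parameters, e.g. "text/html; charset=utf-8" -> "text/html".
        int semi = mimeType.indexOf(';');
        String bare = (semi >= 0 ? mimeType.substring(0, semi) : mimeType).trim();

        // Keep only the part after the slash (the subtype).
        int slash = bare.indexOf('/');
        String subtype = slash >= 0 ? bare.substring(slash + 1) : bare;

        // Keep only the text after the last dot, if there is one.
        int dot = subtype.lastIndexOf('.');
        return dot >= 0 ? subtype.substring(dot + 1) : subtype;
    }

    public static void main(String[] args) {
        String word = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
        System.out.println(lastComponent(word)); // prints "document"
    }
}
```

As the output shows, the Word mime type collapses to the generic "document", which is exactly the weakness described above.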

KEY/VALUE FILE Another option I have looked for is a text file of key/value pairs, where the key is the full mime type and the value is the nice human friendly text.

text/plain=Plain Text File
application/octet-stream=Unknown binary file

This seems like a nice option, but I have not been able to find a definitive text file with lots of entries. It would also be nice if the file had an entry for just the media type (I prefer to call it the primary mime type), i.e. the "text" in "text/plain", so that an unknown text mime type such as "text/unknown a.b.c" would return "Unknown text file/format".
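A minimal sketch of how such a key/value file might be consumed, using java.util.Properties from the standard library. The class name MimeDescriptions, the "text/*" wildcard key used for the primary-type fallback, and the final "Unknown file type" default are all assumptions made for this example; no existing file is known to follow this convention.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

final class MimeDescriptions {
    private final Properties labels = new Properties();

    /** Loads "mime/type=Friendly name" lines from the given text. */
    MimeDescriptions(String propertiesText) {
        try {
            labels.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Exact match first, then the hypothetical "primary/*" entry, then a default. */
    String describe(String mimeType) {
        String key = mimeType.trim().toLowerCase(Locale.ROOT);
        String exact = labels.getProperty(key);
        if (exact != null) {
            return exact;
        }
        int slash = key.indexOf('/');
        String primary = slash >= 0 ? key.substring(0, slash) : key;
        String fallback = labels.getProperty(primary + "/*");
        return fallback != null ? fallback : "Unknown file type";
    }

    public static void main(String[] args) {
        MimeDescriptions d = new MimeDescriptions(
                "text/plain=Plain Text File\n"
              + "application/octet-stream=Unknown binary file\n"
              + "text/*=Unknown text file/format\n");
        System.out.println(d.describe("text/plain"));   // prints "Plain Text File"
        System.out.println(d.describe("text/unknown")); // prints "Unknown text file/format"
    }
}
```

In practice the text would come from a file or classpath resource rather than a string literal; the lookup logic stays the same.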


Apache Tika supports MimeTypes. It also supports Content Detection by the way if you don't know the mime type. Anyway, it looks like you need to do:

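String t = "text/plain";
// Assumes a Tika release that provides MimeTypes.getDefaultMimeTypes();
// forName looks the type up by name and throws MimeTypeException if the name is malformed.
String description = org.apache.tika.mime.MimeTypes.getDefaultMimeTypes().forName(t).getDescription();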

Disclaimer: I didn't actually try it. Also, I don't know if it supports all mime types you need.


Use this library. It works with files, bytes, and so on.

MimeUtil: https://github.com/saces/MimeUtil

Usage:

// Detect candidate mime types for the file
MagicMimeMimeDetector g = new MagicMimeMimeDetector();
Collection<MimeType> list = g.getMimeTypes(file);

// Return the first candidate, if any were found
if (!list.isEmpty())
{
    MimeType mime = list.iterator().next();
    return mime.toString();
}


Comments
  • Retagged; more tags may help you get more answers. ;)
  • The dottoro and pdx-edu links are fairly good but hardly comprehensive. Any chance you have a more complete link? I'm not interested in lists that only give mime type = file extensions.
  • Thanks for spotting that. Inside tika-core.jar there's an XML file, tika-mimetypes.xml, which has a lot of mime types and descriptions defined within it. It looks like it should work. Thanks again!
  • Most of the entries in the XML are ignored because, for some strange reason, Tika is setting descriptions from tags called "_comment" but not "description" etc. Going to file an issue/patch.
  • This appears to be fixed from version 0.8 (issues.apache.org/jira/browse/TIKA-515).