Is it possible to convert text files, such as a Word document or a plain-text document, to PDF using R? I thought of converting the file to .Rmd and then to PDF using this code:

library(rmarkdown)  # provides render() and pdf_document()
my_text <- readLines("C:/.../track.txt")
cat(my_text, sep="  \n", file = "my_text.Rmd")
render("my_text.Rmd", pdf_document())

But it doesn't work and shows this error:

Error: Failed to compile my_text.tex. In addition: Warning message: running command '"pdflatex" -halt-on-error -interaction=batchmode "my_text.tex"' had status 127

Is there any other solution?

.txt to .pdf

Install wkhtmltopdf and then run the following from R. Change the first three lines as appropriate, depending on where wkhtmltopdf is installed on your system and on the input and output file paths and names.

wkhtmltopdf <- "C:\\Program Files\\wkhtmltopdf\\bin\\wkhtmltopdf.exe"
input <- "in.txt"
output <- "out.pdf"
cmd <- sprintf('"%s" "%s" -o "%s"', wkhtmltopdf, input, output)
shell(cmd)  # run the command (shell() is Windows-only)

.docx to .pdf

Install pandoc, modify the first three lines below as needed and run. How well this works may vary depending on your input.

pandoc <- "C:\\Program Files (x86)\\Pandoc\\pandoc.exe"
input <- "in.docx"
output <- "out.pdf"
cmd <- sprintf('"%s" "%s" -o "%s"', pandoc, input, output)
shell(cmd)  # run the command (shell() is Windows-only)
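
A slightly more defensive way to run the same kind of command is sketched below. It uses base R's system2(), which returns the program's exit status, so a failed conversion can be reported instead of passing silently. The helper name convert_with() is hypothetical, and the paths are the same placeholders as above. Note that pandoc needs a LaTeX engine such as pdflatex on the PATH to produce a PDF, and that it reads .docx but not the older .doc format.

```r
# Hypothetical helper: run a converter that accepts `input -o output`
# and return TRUE only if it exits with status 0.
convert_with <- function(exe, input, output) {
  if (!file.exists(exe)) {
    warning("converter not found: ", exe)
    return(FALSE)
  }
  if (!file.exists(input)) {
    warning("input file not found: ", input)
    return(FALSE)
  }
  # shQuote() protects paths that contain spaces
  status <- system2(exe, c(shQuote(input), "-o", shQuote(output)))
  if (status != 0) warning("conversion failed with exit status ", status)
  status == 0
}

# Placeholder paths, as in the answer above; adjust them for your system
pandoc <- "C:\\Program Files (x86)\\Pandoc\\pandoc.exe"
ok <- convert_with(pandoc, "in.docx", "out.pdf")
```

If ok is FALSE, the warning says whether the executable or the input file was missing, or which exit status the converter returned.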

I absolutely have not been able to make the Pandoc method work for me.

I did figure out a way to convert docx to PDF using RDCOMClient, however.


library(RDCOMClient)  # provides COMCreate(); Windows only

file <- "C:/path/to your/doc.docx"

wordApp <- COMCreate("Word.Application")  # create COM object
wordApp[["Visible"]] <- TRUE #opens a Word application instance visibly
wordApp[["Documents"]]$Add() #adds new blank docx in your application
wordApp[["Documents"]]$Open(Filename=file) #opens your docx in wordApp

wordApp[["ActiveDocument"]]$SaveAs("C:/path/to your/new.pdf", 
FileFormat=17) #FileFormat=17 saves as .PDF

wordApp$Quit() #quit wordApp

I found the FileFormat=17 bit here

Hopefully this helps!
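
For repeated use, the steps above can be wrapped in a function. This is only a sketch under a few assumptions: Windows with Microsoft Word and the RDCOMClient package installed; docx_to_pdf_word() is a hypothetical name; and the document is closed without saving before Word quits, so no Word instance is left open if the conversion fails. FileFormat = 17 is the same PDF constant used above.

```r
# Hypothetical wrapper around the Word COM calls shown above.
# Assumes Windows, Microsoft Word, and the RDCOMClient package.
docx_to_pdf_word <- function(docx, pdf = sub("\\.docx$", ".pdf", docx)) {
  # Absolute paths with backslashes (a precaution, not a documented requirement)
  docx <- normalizePath(docx, winslash = "\\", mustWork = TRUE)
  pdf  <- normalizePath(pdf,  winslash = "\\", mustWork = FALSE)

  wordApp <- RDCOMClient::COMCreate("Word.Application")
  on.exit(wordApp$Quit(), add = TRUE)   # always quit Word, even on error

  doc <- wordApp[["Documents"]]$Open(Filename = docx)
  doc$SaveAs(pdf, FileFormat = 17)      # 17 = PDF
  doc$Close(SaveChanges = 0)            # 0 = close without saving changes
  invisible(pdf)
}

# Usage (commented out because it needs Word):
# docx_to_pdf_word("C:/path/to your/doc.docx")
```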

.docx to .pdf with libreoffice

As suggested here by JeanVuda, you can also convert .docx to .pdf with libreoffice, assuming you've made an install of libreoffice on your machine.

The following code converts a .docx file to .pdf using libreoffice:

docfile <- "X:/path_to_your_docx/yourdocxfile.docx" 
# Indicate the correct path for the .docx file you want to convert

system(paste("X:/path_to_libreoffice/program/soffice.exe --headless --convert-to pdf", docfile), intern = TRUE)
# Indicate the correct path where libreoffice executable is located on your machine,
# convert .docx to .pdf with libreoffice.
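
If either path contains spaces, the pasted command above will be split in the wrong places. A minimal sketch that quotes the paths and checks the exit status is shown here; soffice_args() is a hypothetical helper, --outdir is LibreOffice's option for choosing the output folder, and both paths remain the placeholders used above.

```r
# Hypothetical helper: argument vector for a headless LibreOffice conversion.
# By default the PDF is written next to the input file.
soffice_args <- function(docfile, outdir = dirname(docfile)) {
  c("--headless", "--convert-to", "pdf",
    "--outdir", shQuote(outdir), shQuote(docfile))
}

soffice <- "X:/path_to_libreoffice/program/soffice.exe"  # placeholder path
docfile <- "X:/path_to_your_docx/yourdocxfile.docx"      # placeholder path

# Only attempt the conversion when both files actually exist
if (file.exists(soffice) && file.exists(docfile)) {
  status <- system2(soffice, soffice_args(docfile))
  if (status != 0) warning("soffice exited with status ", status)
}
```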

Feedback on libreoffice:
  1. Where my pandoc version fails to convert a .docx to a .pdf and RDCOMClient is not available for my version of R, libreoffice provides a fast and direct way to convert a Word document to multiple formats.

  2. Please note that in the .pdf conversion the tables don't render correctly (but are printed in landscape mode). The most direct workaround I can find is to turn my tables into images with kableExtra::as_image() while knitting the Word document, which may not be appropriate for what you need.

  3. There are previous questions about converting to other formats from the command line here, and I guess the original answer that introduced this method to useRs is that one, in the ReporteR discussion.

Best regards

  • What OS are you working on?
  • I am using Windows 7
  • You might need to install MiKTeX and pandoc
  • "text files such as word document or text document" - different types of file will need a different procedure. You may like to narrow the scope of your question
  • I am still getting an error; I think it is related to my computer: Warning messages: 1: running command 'C:\Windows\system32\cmd.exe /c "C://Program Files (x86)/Pandoc/pandoc.exe" "C:/Users/../TMP-GAF01 - Curriculum Vitae_MJ.doc" -o "C:/Users/../CV_J.pdf"' had status 1 2: In shell(cmd) : '"C://Program Files (x86)/Pandoc/pandoc.exe" "C:/Users/.../TMP-- Curriculum.doc" -o "C:/Users/.../CV_J.pdf"' execution failed with error code 1.
  • The question is about .docx files. That is not the same as .doc.
  • This code worked nicely. I just added wordApp[["ActiveDocument"]]$Close(SaveChanges = 0) before the Quit line, so that the document is closed without saving any changes.
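
The status 127 in the question's error is what R's system() reports when a command cannot be run at all, which suggests that pdflatex was not found rather than that the document was wrong. A quick check along these lines, using only base R, shows which of the external tools mentioned in this thread are visible from R; tool_available() is a hypothetical helper name.

```r
# Hypothetical helper: TRUE if an executable can be found on the PATH.
tool_available <- function(name) {
  nzchar(Sys.which(name))[[1]]
}

# Tools mentioned in this thread; "soffice" is LibreOffice's launcher
tools <- c("pdflatex", "pandoc", "wkhtmltopdf", "soffice")
found <- vapply(tools, tool_available, logical(1))
print(found)  # FALSE means R cannot see that tool on the PATH
```

A tool that is installed but reported as FALSE can still be used by giving its full path, as the answers above do.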