Converting Microsoft Office files to PDF files

28 May 2017
By Huiji Ying

My project was to convert old Microsoft Office documents (.doc, .xls and .ppt files) to PDF format in order to better preserve the data.

MS Word is one of the most popular word processors around. People have created countless Word documents every day since it was introduced, and rely on them to preserve their thoughts and data. The same goes for Excel and PowerPoint. However, technology moves fast. Soon these old formats will be unopenable, no longer supported by new versions of Microsoft Office or by any other software. We won’t be able to retrieve usable data from files saved long ago.

Converting them to PDF is so far the most convenient way of preserving them. PDF is also a format that people are familiar with. It keeps the formatting and contents of your document intact. In everyday use you don’t have to juggle old and new versions that are incompatible with each other; it’s just one .pdf. It is mobile-device friendly and supported by almost every system.

Although Microsoft Office has a built-in ability to save as PDF, it is laborious and often impractical to manually open every old document and re-save it as a PDF. Therefore, I introduce to you a bash script that runs on both Mac and Linux. With the help of LibreOffice, it automatically converts all .doc/.xls/.ppt files in a folder (and its subfolders) to PDF format, and saves each PDF in the same folder as the file it was converted from.

Discover the script.

Here are the instructions to run the script on a Mac or Linux machine, for a folder of files that you want to convert to PDF (the folder may contain many subfolders):

  • Open a Terminal (on a Mac, the black icon in the Dock, or search for “terminal” in Spotlight Search [cmd + space])
  • cd to the directory where you downloaded the script
  • Type in the following command and hit Enter:
    ./ <parent-dir-containing-old-MS-files>
  • Each converted PDF is created in the same folder as the old file it came from.
  • Each converted PDF is named <old-filename.old-suffix>.pdf.
  • Each converted PDF shares the same attributes (read, write and execute permissions, timestamps, etc.) as the old MS file.
  • LibreOffice is required to run this bash script.
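
For readers curious about how such a conversion can work, here is a minimal bash sketch of the behaviour described in the list above. It is not the script itself, only an illustration under a few assumptions: the function names (pdf_name_for, convert_one, convert_tree) and the SOFFICE variable are made up for this example, and LibreOffice's command-line launcher is assumed to be reachable as "soffice" (on a Mac it usually lives inside the LibreOffice application bundle, so you may need to point SOFFICE at it). For brevity the sketch copies only timestamps, not permissions.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; every name below is hypothetical, not taken
# from the original script.

# Assumption: LibreOffice's launcher is available as "soffice".
# Override by setting SOFFICE to its full path.
SOFFICE="${SOFFICE:-soffice}"

# Print the PDF path for a source file: <old-filename.old-suffix>.pdf
pdf_name_for() {
  printf '%s.pdf\n' "$1"
}

# Convert a single file and place the PDF next to it.
convert_one() {
  local src="$1" base stem target tmp
  base=$(basename "$src")
  stem="${base%.*}"              # file name without its last suffix
  target=$(pdf_name_for "$src")
  tmp=$(mktemp -d) || return 1

  # LibreOffice writes <stem>.pdf into --outdir. A scratch directory keeps
  # a.doc and a.xls from overwriting each other's a.pdf.
  if "$SOFFICE" --headless --convert-to pdf --outdir "$tmp" "$src" \
       </dev/null >/dev/null 2>&1 && [ -f "$tmp/$stem.pdf" ]; then
    mv "$tmp/$stem.pdf" "$target"
    touch -r "$src" "$target"    # copy the old file's timestamps
    rm -rf "$tmp"
  else
    echo "conversion failed: $src" >&2
    rm -rf "$tmp"
    return 1
  fi
}

# Walk a folder and its subfolders, converting every .doc/.xls/.ppt file.
convert_tree() {
  find "$1" -type f \( -iname '*.doc' -o -iname '*.xls' -o -iname '*.ppt' \) -print0 |
    while IFS= read -r -d '' f; do
      convert_one "$f"
    done
}

# Run only when a folder is given on the command line.
if [ "$#" -gt 0 ]; then
  convert_tree "$1"
fi
```

If you save the sketch under a name of your choosing, you would pass it the parent folder as its only argument. Converting into a scratch directory first and then renaming is one way to arrive at the <old-filename.old-suffix>.pdf naming; the actual script may do this differently.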