Fileinfo detecting .docx files as zip files


I had an issue today where a Word document uploaded to our CMS was detected by Fileinfo as a zip archive and stored as such. This caused problems when viewing the file: it was served back to the user with an application/x-zip MIME type, so it was opened by the wrong application.

This is not actually a bug in the Fileinfo extension, PHP or the browser. Word 2007 and later save documents in the .docx format by default, and a .docx file is a zip archive of XML files, which also helps keep documents small. Since Fileinfo uses only the data in the file to detect its type, and a .docx file starts with the same signature as any other zip archive, it could not tell the difference between a file like this and an actual zip file.
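To illustrate why content-based detection sees a zip archive, here is a minimal sketch that checks the leading bytes of a file for the standard zip local file header signature ("PK" followed by the bytes 0x03 and 0x04). The function name `startsWithZipSignature` is hypothetical and is not part of Fileinfo, which uses a far larger set of rules.

```php
<?php
// Hypothetical helper, for illustration only: reports whether the given
// bytes begin with the zip local file header signature "PK\x03\x04".
// A .docx file and an ordinary .zip file both pass this check.
function startsWithZipSignature(string $bytes): bool
{
    return strncmp($bytes, "PK\x03\x04", 4) === 0;
}
```

Passing the first few bytes of an uploaded file (for example, read with `file_get_contents($path, false, null, 0, 4)`) should give the same answer for a Word document and for a plain zip archive.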

If you store uploaded files on the file system under their original filenames, you shouldn't have any problems, since the .docx extension is kept and the web server can typically pick the MIME type from it. The solution I put in place was to check the browser-supplied original filename whenever a zip file was detected, and if that filename had a .docx extension, to store the file as a Word document.
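The check described above could be sketched roughly as follows. This is not the exact code from our CMS: the function name `resolveUploadMimeType` and the list of zip MIME types are assumptions for illustration, and the strings Fileinfo actually returns for zip files may differ with your magic database.

```php
<?php
// Hypothetical helper: falls back to the original filename's extension
// only when Fileinfo has reported a zip archive.
function resolveUploadMimeType(string $detectedMime, string $originalName): string
{
    // Assumed set of MIME types that may be reported for zip archives.
    $zipTypes = [
        'application/zip',
        'application/x-zip',
        'application/x-zip-compressed',
    ];
    // Registered MIME type for Word .docx documents.
    $docxMime = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document';

    // Compare the extension case-insensitively, so "Report.DOCX" also matches.
    $extension = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));

    if (in_array($detectedMime, $zipTypes, true) && $extension === 'docx') {
        return $docxMime;
    }

    // Anything else keeps the type Fileinfo detected.
    return $detectedMime;
}
```

In an upload handler, the detected type would come from something like `(new finfo(FILEINFO_MIME_TYPE))->file($_FILES['upload']['tmp_name'])` and the original name from `$_FILES['upload']['name']`, where the `upload` field name is an assumption. Because the filename is only consulted after a zip archive has been detected, a renamed file of another type is still stored under whatever Fileinfo reports.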

