Pandas read_excel() fails with xlrd.biffh.XLRDError: Can't find workbook in OLE2 compound document

I'm trying to use pandas to parse an .xlsm document. My code worked perfectly with the example file I was given, but once I got the rest of the documents, it failed with the above error. Here's the offending stack trace:

Traceback (most recent call last):
  File "@@@@@@@@/UnsupervisedCAM.py", line 9, in <module>
    info_dict = read_excel_to_dict('files/' + filename)
  File "@@@@@@@@\readCAM.py", line 7, in read_excel_to_dict
    df = pandas.read_excel(filename, parse_cols='E,G,I,K,Q,O')
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 191, in read_excel
    io = ExcelFile(io, engine=engine)
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 249, in __init__
    self.book = xlrd.open_workbook(io)
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\__init__.py", line 441, in open_workbook
    ragged_rows=ragged_rows,
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 87, in open_workbook_xls
    ragged_rows=ragged_rows,
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 595, in biff2_8_load
    raise XLRDError("Can't find workbook in OLE2 compound document")
xlrd.biffh.XLRDError: Can't find workbook in OLE2 compound document

I'm not even sure where to start... Haven't found anything of use online.


I got the same error message and solved it by removing the password protection from the .xlsx file. (I'm not saying that's the only cause of the error, but it's worth checking!)
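One way to check for this without opening Excel is to look at the first bytes of the file. A regular .xlsx/.xlsm file is a zip archive, whereas a password-protected one is wrapped in an OLE2 compound document, which is why xlrd tries to treat it as an old-style .xls and cannot find a workbook stream. The sketch below uses only the standard library; the helper names sniff_container and looks_encrypted_ooxml are made up for this illustration, and an OLE2 signature on an .xlsx/.xlsm file is only a strong hint of encryption, not proof.

```python
from pathlib import Path

# File signatures (first bytes of the file)
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 / Compound File Binary
ZIP_MAGIC = b"PK\x03\x04"                        # zip local file header


def sniff_container(path):
    """Return 'ole2', 'zip', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def looks_encrypted_ooxml(path):
    """Hypothetical heuristic: an .xlsx/.xlsm file stored as OLE2 is
    probably password protected (or mislabelled)."""
    suffix = Path(path).suffix.lower()
    return suffix in {".xlsx", ".xlsm"} and sniff_container(path) == "ole2"
```

Calling looks_encrypted_ooxml('files/' + filename) on each document before pandas.read_excel would separate the files that need their password removed from the ones that should load normally.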

A related report: the file for the most recent year available, 2013 (coalpublic2013.xls), loads without a problem using import pandas as pd followed by df1 = pd.read_excel("coalpublic2013.xls"), but the .xls files for the earlier years (2004-2012) do not load. I have looked at these files with Excel; they open and are not corrupted.


After a lot of searching, the only way I've found to do this is to open and save all the Excel documents, which seems to 'strip' them of their OLE2 format. I automated the process with the following VBScript:

' Open and re-save every matching workbook in a folder.
Dim objFSO, objFolder, objFile
Dim objExcel, objWB
Dim MyFolder

Set objExcel = CreateObject("Excel.Application")
Set objFSO = CreateObject("Scripting.FileSystemObject")
MyFolder = "<PATH/TO/FILES>"
Set objFolder = objFSO.GetFolder(MyFolder)

For Each objFile In objFolder.Files
    ' Right(..., 4) compares the last four characters, e.g. "xlsm" or ".xls"
    If Right(objFile.Name, 4) = "<EXTENSION>" Then
        Set objWB = objExcel.Workbooks.Open(objFile.Path)
        objWB.Save
        objWB.Close
    End If
Next

' Shut down Excel and release the objects
objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing
WScript.Echo "Done"

Make sure to change the path to the folder and extension.
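To see which files still fail after re-saving (or which ones need the treatment in the first place), a small Python loop can try to load each file and collect the errors. This is only a sketch: find_unreadable is a hypothetical helper name, it assumes pandas.read_excel is the loader you care about unless you pass another callable, and it catches every exception because the exact exception type depends on the engine in use.

```python
from pathlib import Path


def find_unreadable(folder, extension, loader=None):
    """Try to load every file in `folder` with the given extension.

    Returns a dict mapping file name -> error message for files that failed.
    """
    if loader is None:
        # Imported lazily so the helper can be exercised with a custom loader
        import pandas
        loader = pandas.read_excel

    failures = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() == extension.lower():
            try:
                loader(path)
            except Exception as exc:  # xlrd raises XLRDError; other engines differ
                failures[path.name] = f"{type(exc).__name__}: {exc}"
    return failures
```

For example, find_unreadable("files", ".xlsm") returns an empty dict once every workbook in the folder loads.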


If you run into this error in a Jupyter notebook, as I did, restarting the kernel can resolve it.
