Pandas read_excel() fails with xlrd.biffh.XLRDError: Can't find workbook in OLE2 compound document
I'm trying to use pandas to parse an .xlsm document. My code worked perfectly with the example file I was given, but once I got the rest of the documents, it failed with the above error. Here's the offending stack trace:
Traceback (most recent call last):
  File "@@@@@@@@/UnsupervisedCAM.py", line 9, in <module>
    info_dict = read_excel_to_dict('files/' + filename)
  File "@@@@@@@@\readCAM.py", line 7, in read_excel_to_dict
    df = pandas.read_excel(filename, parse_cols='E,G,I,K,Q,O')
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 191, in read_excel
    io = ExcelFile(io, engine=engine)
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 249, in __init__
    self.book = xlrd.open_workbook(io)
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\__init__.py", line 441, in open_workbook
    ragged_rows=ragged_rows,
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 87, in open_workbook_xls
    ragged_rows=ragged_rows,
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 595, in biff2_8_load
    raise XLRDError("Can't find workbook in OLE2 compound document")
xlrd.biffh.XLRDError: Can't find workbook in OLE2 compound document
I'm not even sure where to start... Haven't found anything of use online.
I got the same error message and solved it by removing the password protection from the .xlsx file (not saying that it's the only reason for the error, but it's worth checking!).
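One way to check for this without opening Excel is to look at the first bytes of the file. As far as I know, an ordinary .xlsx/.xlsm file is a ZIP archive, while a password-protected one is wrapped in an OLE2 compound document, which would be consistent with xlrd taking the open_workbook_xls path in the traceback. The sketch below is only a heuristic; the function names sniff_container and looks_mismatched are made up for illustration and are not part of pandas or xlrd.

```python
from pathlib import Path

# File signatures ("magic numbers") found at the very start of the file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound document
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP archive (.xlsx/.xlsm)


def sniff_container(path):
    """Return 'ole2', 'zip', or 'unknown' based on the first bytes of the file."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def looks_mismatched(path):
    """True when an .xlsx/.xlsm file is stored as OLE2 instead of ZIP.

    Hypothetical helper: a True result only hints at password protection
    (or some other wrapper); it does not prove it.
    """
    suffix = Path(path).suffix.lower()
    return suffix in {".xlsx", ".xlsm"} and sniff_container(path) == "ole2"
```

Running looks_mismatched over each file in the folder should flag the ones worth opening in Excel to check for a password.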
After a lot of searching, the only way I've found to do this is to open and save all the Excel documents, which seems to 'strip' them of their OLE2 format. I automated the process with the following VBScript:
    Dim objFSO, objFolder, objFile
    Dim objExcel, objWB

    Set objExcel = CreateObject("Excel.Application")
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    ' Folder containing the workbooks to re-save
    MyFolder = "<PATH/TO/FILES>"
    Set objFolder = objFSO.GetFolder(MyFolder)

    For Each objFile In objFolder.Files
        ' Compare the last four characters of the file name with the extension
        If Right(objFile.Name, 4) = "<EXTENSION>" Then
            ' Open and save each workbook so Excel rewrites it
            Set objWB = objExcel.Workbooks.Open(objFile.Path)
            objWB.Save
            objWB.Close
        End If
    Next

    objExcel.Quit
    Set objExcel = Nothing
    Set objFSO = Nothing
    WScript.Echo "Done"
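Make sure to change the folder path and the extension. The script compares the last four characters of each file name, so the extension has to be exactly four characters long (for example "xlsm" or ".xls").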
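Before re-saving everything, it may help to find out which files actually fail to load. The following is a minimal sketch, not part of the original answer: find_unreadable is a hypothetical helper that calls whatever reader you pass in (for example pandas.read_excel) on each matching file and records the error instead of stopping at the first one.

```python
from pathlib import Path


def find_unreadable(folder, extension, reader):
    """Call `reader` on every file in `folder` with the given extension.

    Returns a dict mapping file name -> error text for the files that failed.
    `reader` is any callable taking a path, e.g. pandas.read_excel.
    """
    failures = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() != extension.lower():
            continue
        try:
            reader(path)
        except Exception as exc:
            # Caught broadly on purpose: xlrd raises XLRDError, but other
            # readers may raise different exception types.
            failures[path.name] = f"{type(exc).__name__}: {exc}"
    return failures
```

Assuming pandas is installed, a call such as find_unreadable("files", ".xlsm", pandas.read_excel) would return only the workbooks that need the open-and-save treatment.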
If you hit this issue in a Jupyter notebook, as I did when searching for the error, you can simply restart the kernel and the issue is resolved.
Load password protected Excel files into Pandas DataFrame: when trying to read an Excel file into a pandas DataFrame gives you "XLRDError: Can't find workbook in OLE2 compound document", the issue might be that you are dealing with a password-protected Excel file.