Read merged cells in Excel with Python

I am trying to read merged cells of Excel with Python using xlrd.

My Excel: (note that the first column is merged across the three rows)

    A   B   C
  +---+---+----+
1 | 2 | 0 | 30 |
  +   +---+----+
2 |   | 1 | 20 |
  +   +---+----+
3 |   | 5 | 52 |
  +---+---+----+

I would like the third row of the first column to read as 2 in this example, but it returns ''. Do you have any idea how to get the value of the merged cell?

My code:

all_data = [[]]
excel = xlrd.open_workbook(excel_dir+ excel_file)
sheet_0 = excel.sheet_by_index(0) # Open the first tab

for row_index in range(sheet_0.nrows):
    row= ""
    for col_index in range(sheet_0.ncols):
        value = sheet_0.cell(rowx=row_index,colx=col_index).value             
        row += "{0} ".format(value)
        split_row = row.split()   
    all_data.append(split_row)

What I get:

'2', '0', '30'
'1', '20'
'5', '52'

What I would like to get:

'2', '0', '30'
'2', '1', '20'
'2', '5', '52'

I just tried this and it seems to work for your sample data:

import xlrd

all_data = []
excel = xlrd.open_workbook(excel_dir + excel_file)
sheet_0 = excel.sheet_by_index(0)  # Open the first tab

# Values of the previous row, used to fill empty cells in the current row
prev_row = [None for i in range(sheet_0.ncols)]
for row_index in range(sheet_0.nrows):
    row = []
    for col_index in range(sheet_0.ncols):
        value = sheet_0.cell(rowx=row_index, colx=col_index).value
        # Compare with '' rather than calling len(): numeric cells come back as floats
        if value == '':
            value = prev_row[col_index]
        row.append(value)
    prev_row = row
    all_data.append(row)

returning

[['2', '0', '30'], ['2', '1', '20'], ['2', '5', '52']]

It keeps track of the values from the previous row and uses them if the corresponding value from the current row is empty.

Note that the above code does not check if a given cell is actually part of a merged set of cells, so it could possibly duplicate previous values in cases where the cell should really be empty. Still, it might be of some help.
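The caveat above can be illustrated with a small sketch. It uses plain lists instead of an xlrd sheet so that it runs on its own; the function names and the sample grid (with one genuinely empty cell in column C) are hypothetical, and the merged ranges are assumed to be given as (rlo, rhi, clo, chi) tuples with exclusive upper bounds.

```python
def fill_down(rows):
    """Naive approach: copy the value from the row above into every empty cell."""
    filled = []
    prev = None
    for row in rows:
        new_row = list(row)
        if prev is not None:
            for c, value in enumerate(new_row):
                if value == '':
                    new_row[c] = prev[c]
        filled.append(new_row)
        prev = new_row
    return filled


def fill_merged(rows, merged_ranges):
    """Merged-aware approach: only fill cells that lie inside a merged range."""
    filled = [list(row) for row in rows]
    for rlo, rhi, clo, chi in merged_ranges:
        anchor = rows[rlo][clo]  # the top-left cell holds the merged value
        for r in range(rlo, rhi):
            for c in range(clo, chi):
                filled[r][c] = anchor
    return filled


# Hypothetical data: column A is merged over three rows,
# and the middle cell of column C is really empty.
rows = [
    ['2', '0', '30'],
    ['',  '1', ''],
    ['',  '5', '52'],
]
merged_ranges = [(0, 3, 0, 1)]

print(fill_down(rows))                   # also fills the empty cell in column C
print(fill_merged(rows, merged_ranges))  # leaves the empty cell in column C alone
```

The second function needs the merged ranges, which is exactly the information the row-by-row approach does not have.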

Additional information:

I subsequently found a documentation page that describes a merged_cells attribute, which can be used to determine the cells included in each range of merged cells. The documentation says that it is "New in version 0.6.1", but when I tried to use it with xlrd-0.9.3, as installed by pip, I got the error

NotImplementedError: formatting_info=True not yet implemented

I'm not particularly inclined to start chasing down different versions of xlrd to test the merged_cells feature, but perhaps you might be interested in doing so if the above code is insufficient for your needs and you encounter the same error that I did with formatting_info=True.

You can also try forward filling with pandas (see the fillna documentation: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html)

import pandas as pd

df = pd.read_excel(dir + filename, header=1)
# .ffill() is the current spelling of the older fillna(method='ffill')
df[ColName] = df[ColName].ffill()

This replaces each empty (NaN) cell in the column with the last non-empty value above it.
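As a rough illustration of what forward filling does, here is a minimal sketch on an in-memory DataFrame instead of an Excel file. The column names and values mirror the question's sample; the empty cell in column C is a hypothetical addition to show that filling a single column leaves other columns untouched, while filling the whole DataFrame does not.

```python
import pandas as pd

nan = float("nan")

# Stand-in for what pd.read_excel returns when column A is merged over three rows
df = pd.DataFrame({
    "A": [2, nan, nan],
    "B": [0, 1, 5],
    "C": [30, nan, 52],  # hypothetical: this NaN is a genuinely empty cell
})

# Forward fill only the column known to contain merged cells
df["A"] = df["A"].ffill()

# Forward filling every column would also overwrite the empty cell in C
all_filled = df.ffill()

print(df)
print(all_filled)
```

Restricting the fill to the columns that actually contain merged cells avoids overwriting cells that are meant to be empty.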

This is for those who want to handle merged cells the way the OP asked, without overwriting non-merged empty cells.

Based on the OP's code and the additional information in @gordthompson's answer and @stavinsky's comment, the following code works for Excel files (xls, xlsx). It reads the first sheet of the file as a DataFrame and, for each merged range, replicates the merged cell's content over all the cells that range covers, as the original poster asked. Note that xlrd's merged_cells feature works for 'xls' files only if formatting_info=True is passed when opening the workbook.

import pandas as pd
import xlrd

filepath = excel_dir + excel_file
if excel_file.endswith('xlsx'):
    # Note: xlrd 2.0 and later read only .xls; this branch needs xlrd < 2.0
    book = xlrd.open_workbook(filepath)
elif excel_file.endswith('xls'):
    # merged_cells is only populated for .xls when formatting_info=True
    book = xlrd.open_workbook(filepath, formatting_info=True)
else:
    raise ValueError("don't yet know how to handle other excel file formats")
excel = pd.ExcelFile(book, engine='xlrd')
sheet_0 = book.sheet_by_index(0)  # Open the first tab
df = excel.parse(0, header=None)  # Read the first tab as a DataFrame

# Each entry is (row_low, row_high, col_low, col_high); upper bounds are exclusive
for e in sheet_0.merged_cells:
    rl, rh, cl, ch = e
    print(e)
    base_value = sheet_0.cell_value(rl, cl)  # value of the top-left cell
    print(base_value)
    df.iloc[rl:rh, cl:ch] = base_value

I tried the previous solutions without success; nevertheless, the following worked for me:

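import xlrd

# Assumption: same path variables as in the question; formatting_info=True is needed for .xls
book = xlrd.open_workbook(excel_dir + excel_file, formatting_info=True)
sheet = book.sheet_by_index(0)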
all_data = []

for row_index in range(sheet.nrows):
    row = []
    for col_index in range(sheet.ncols):
        valor = sheet.cell(row_index,col_index).value
        if valor == '':
            for crange in sheet.merged_cells:
                rlo, rhi, clo, chi = crange
                if rlo <= row_index and row_index < rhi and clo <= col_index and col_index < chi:
                    valor = sheet.cell(rlo, clo).value
                    break
        row.append(valor)
    all_data.append(row)

print(all_data)

I hope this helps someone in the future.
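The loop above scans every merged range for each empty cell. For sheets with many merged ranges, one possible variation is to build a lookup table once and consult it per cell. The sketch below uses plain lists so that it runs without a workbook; build_anchor_map and cell_value are hypothetical helper names, and the ranges are assumed to be in the same (rlo, rhi, clo, chi) form that sheet.merged_cells yields.

```python
def build_anchor_map(merged_ranges):
    """Map every (row, col) inside a merged range to that range's top-left cell."""
    anchors = {}
    for rlo, rhi, clo, chi in merged_ranges:
        for r in range(rlo, rhi):
            for c in range(clo, chi):
                anchors[(r, c)] = (rlo, clo)
    return anchors


def cell_value(rows, anchors, r, c):
    """Return the value at (r, c), redirecting merged cells to their top-left cell."""
    ar, ac = anchors.get((r, c), (r, c))
    return rows[ar][ac]


# Sample data from the question: column A merged over rows 0 to 2
rows = [
    ['2', '0', '30'],
    ['',  '1', '20'],
    ['',  '5', '52'],
]
anchors = build_anchor_map([(0, 3, 0, 1)])

all_data = [
    [cell_value(rows, anchors, r, c) for c in range(len(rows[r]))]
    for r in range(len(rows))
]
print(all_data)
```

With an xlrd sheet, the same map could be used as sheet.cell(*anchors.get((r, c), (r, c))).value in place of the inner scan.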

Using xlrd's merged_cells:

import pandas as pd
import xlrd

ExcelFile = pd.read_excel("Excel_File.xlsx")
xl = xlrd.open_workbook("Excel_File.xlsx")
FirstSheet = xl.sheet_by_index(0)
for crange in FirstSheet.merged_cells:
    rlo, rhi, clo, chi = crange
    for rowx in range(rlo, rhi):
        for colx in range(clo, chi):
            value = FirstSheet.cell(rowx, colx).value
            # Sheet row 0 became the DataFrame header, so sheet row n is DataFrame row n - 1
            if value == '' and rowx > 0:
                ExcelFile.iloc[rowx - 1, colx] = FirstSheet.cell(rlo, clo).value

Comments
  • Can you make the question reproducible? We would like to see raw data and code you use to import it.
  • If you do a print all_data after the for loop, what do you get? And what do you expect?
  • Hello! I added the results too.
  • Even more info in this mailing list thread. formatting_info is unsupported for .xlsx files, unfortunately.
  • formatting_info isn't needed for xlsx, as far as I can see.
  • This will fill all NA cells, even those that are not merged.