Openpyxl max_row and max_column wrongly reports a larger figure

My question concerns a function in a parsing script I'm developing. I am trying to write a Python function that finds the column number of a matching value in an Excel sheet. The workbook is created on the fly with openpyxl; its first row holds headers starting at the 3rd column, each spanning 4 columns merged into one. In a subsequent function, I parse content to be added to the columns under the matching headers. (Additional info: the content I'm parsing is BLAST+ output. I'm trying to create a summary spreadsheet with the hit names as column headers, each with subcolumns for hits, gaps, span and identity. The first two columns are the query contig and its length.)

I had initially written a similar function for xlrd and it worked. But when I rewrote it for openpyxl, I found that the max_row and max_column attributes report more rows and columns than are actually present. For instance, I have 20 rows for this pilot input, but max_row reports 82. Note that I manually selected the empty rows and columns in Excel, right-clicked and deleted them, as advised elsewhere on this forum. This didn't change the result.

def find_column_number(x):
    # Return the 1-based column whose cell value equals x (0 if not found).
    col = 0
    print("maxrow =", hrsh.max_row)
    print("maxcol =", hrsh.max_column)
    # openpyxl rows and columns are 1-based, so both ranges start at 1
    for rowz in range(1, hrsh.max_row + 1):
        print("now the row is", rowz)
        for colz in range(1, hrsh.max_column + 1):
            print("now the column is", colz)
            name = hrsh.cell(row=rowz, column=colz).value
            if name == x:
                col = colz
    return col

The issue with max_row and max_column has been discussed here: https://bitbucket.org/openpyxl/openpyxl/issues/514/cell-max_row-reports-higher-than-actual I applied the suggestion from that thread, but max_row is still wrong.

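for row in reversed(list(hrsh.rows)):  # rows may be a generator, so build a list first
    values = [cell.value for cell in row]
    if any(values):
        print("last row with data is {0}".format(row[0].row))
        maxrow = row[0].row
        break  # stop at the first non-empty row counted from the bottom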

I then tried the suggestion at https://www.reddit.com/r/learnpython/comments/3prmun/openpyxl_loop_through_and_find_value_of_the/ and tried to read the column values. Once again, the script takes the empty columns into account and reports more columns than are actually present.

for currentRow in hrsh.rows:
    for currentCell in currentRow:
        print(currentCell.value)

Can you please help me resolve this error, or suggest another method to achieve my aim?

As noted in the bug report you linked to, a sheet's reported dimensions may include empty rows or columns. If max_row and max_column are not reporting what you want to see, you will need to write your own code to find the first completely empty row. The most efficient way, of course, would be to start from max_row and work backwards, but the following is probably sufficient:

for max_row, row in enumerate(ws, 1):
    if all(c.value is None for c in row):
        break
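
The backward scan mentioned above could look roughly like the sketch below. It assumes ws is an openpyxl worksheet (anything exposing max_row, max_column and cell(row=..., column=...) would do); the helper names last_data_row and last_data_column are made up for this illustration and are not part of openpyxl. Cells holding None or an empty string are treated as blank.

```python
def _is_blank(value):
    # Treat both missing values and empty strings as "no data".
    return value is None or value == ""


def last_data_row(ws):
    """Return the 1-based index of the last row containing a value, or 0."""
    # Walk upwards from the reported (possibly inflated) max_row.
    for row_idx in range(ws.max_row, 0, -1):
        for col_idx in range(1, ws.max_column + 1):
            if not _is_blank(ws.cell(row=row_idx, column=col_idx).value):
                return row_idx
    return 0


def last_data_column(ws):
    """Return the 1-based index of the last column containing a value, or 0."""
    # Walk leftwards from the reported (possibly inflated) max_column.
    for col_idx in range(ws.max_column, 0, -1):
        for row_idx in range(1, ws.max_row + 1):
            if not _is_blank(ws.cell(row=row_idx, column=col_idx).value):
                return col_idx
    return 0
```

With a workbook loaded in the usual way, for example ws = openpyxl.load_workbook("summary.xlsx").active (the file name is only a placeholder), range(1, last_data_row(ws) + 1) can then replace range(1, ws.max_row + 1) in the search loop. Cell-by-cell access like this may be slow on large sheets.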

I confirm the bug found by the OP. I found newer posts reporting max_row being too large. This bug cannot be fixed.

In my case, it appears when I set the value of all cells in a worksheet to None. After this operation, the worksheet still reports the old dimensions.

A call to ws.calculate_dimensions() does not change anything. Closing and restarting Excel still leaves openpyxl reporting the same wrong dimensions.

This is a problem because ws.append() starts at ws.max_row, and there is no way to override this behaviour. You end up with a worksheet that is blank at the top and then, somewhere further down, the data you appended appears.

The only remedy I found is to delete entire rows by hand in Excel. openpyxl then shows the correct max_row.

I found out that this is linked to the member ws._cells not being empty as it should after setting all cells to None. However, the user cannot delete this dictionary as it is a private member.
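
One possible way around the append() behaviour described above is to skip append() and write each value to an explicit row with ws.cell(row=..., column=..., value=...), which is part of openpyxl's worksheet API. The sketch below is only an illustration under that assumption; write_row is a hypothetical helper name, not an openpyxl function.

```python
def write_row(ws, row_idx, values, first_col=1):
    """Write values into row row_idx, starting at column first_col (1-based).

    Returns the index of the next row, so calls can be chained.
    """
    for offset, value in enumerate(values):
        # ws.cell() with a value argument sets the cell at an explicit position.
        ws.cell(row=row_idx, column=first_col + offset, value=value)
    return row_idx + 1
```

Starting with next_row = 1 and doing next_row = write_row(ws, next_row, ["contig_1", 1500]) for each record keeps the data at the top of the sheet no matter what max_row reports.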

I see the same behaviour with the latest version, 3.0.3, of openpyxl. I use an XLSX file as a template (created from an XLS file), open it, add some data, then save it under a different name. I found that max_row is set to 49 and I don't know why.

However after reading in the online documentation https://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.worksheet.html this line:

Do not create worksheets yourself, use openpyxl.workbook.Workbook.create_sheet() instead

I created my XLSX template directly from openpyxl simply as follows:

wb = openpyxl.Workbook()
wb.save(filename="template.xlsx")

It works fine now (max_row=1). Hope it helps.

Comments
  • I tried this suggestion now, but it still shows the max rows to the same larger inflated number. Taking cue from this comment, I also tried what's suggested here stackoverflow.com/questions/40547394/… and looked for both None and empty string, but I still get the same larger inflated number.
  • What do you mean by larger number? max_row will be a local variable and have no direct effect on the worksheet's dimensions.