AttributeError: 'generator' object has no attribute 'to_sql' while creating a DataFrame using a generator

I am trying to create a DataFrame from a fixed-width file and load it into a PostgreSQL database. My input file is very large (~16 GB, 20 million records), so building a single DataFrame consumes most of the available RAM and takes a long time to complete. I therefore tried the chunksize option (with a Python generator) to commit the records to the table chunk by chunk, but it fails with the error AttributeError: 'generator' object has no attribute 'to_sql'.

Inspired by this answer here https://stackoverflow.com/a/47257676/2799214

input file: test_file.txt

XOXOXOXOXOXO9
AOAOAOAOAOAO8
BOBOBOBOBOBO7
COCOCOCOCOCO6
DODODODODODO5
EOEOEOEOEOEO4
FOFOFOFOFOFO3
GOGOGOGOGOGO2
HOHOHOHOHOHO1

sample.py

import pandas.io.sql as psql
import pandas as pd
from sqlalchemy import create_engine

def chunck_generator(filename, header=False,chunk_size = 10 ** 5):
    for chunk in pd.read_fwf(filename, colspecs=[[0,12],[12,13]],index_col=False,header=None, iterator=True, chunksize=chunk_size):
        yield (chunk)

def _generator( engine, filename, header=False,chunk_size = 10 ** 5):
    chunk = chunck_generator(filename, header=False,chunk_size = 10 ** 5)
    chunk.to_sql('sample_table', engine, if_exists='replace', schema='sample_schema', index=False)
    yield row

if __name__ == "__main__":
    filename = r'test_file.txt'
    engine = create_engine('postgresql://ABCD:ABCD@ip:port/database')
    c = engine.connect()
    conn = c.connection
    generator = _generator(engine=engine, filename=filename)
    while True:
       print(next(generator))
    conn.close()

Error:

    chunk.to_sql('sample_table', engine, if_exists='replace', schema='sample_schema', index=False)
AttributeError: 'generator' object has no attribute 'to_sql'

My primary goal is to improve performance. Please help me resolve the issue or suggest a better approach. Thanks in advance.


'chunck_generator' returns a generator object, not an actual chunk. You need to iterate over that object to get the chunks out of it.

>>> def my_generator(x):
...     for y in range(x):
...         yield y
...
>>> g = my_generator(10)
>>> print(g.__class__)
<class 'generator'>
>>> ele = next(g, None)
>>> print(ele)
0
>>> ele = next(g, None)
>>> print(ele)
1
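
To make the failure itself concrete, here is a minimal sketch that uses only the standard library. The names chunk_generator and rows are hypothetical stand-ins for the question's code, and plain lists stand in for DataFrame chunks. It shows that the method exists on each yielded chunk, not on the generator that produces the chunks.

```python
def chunk_generator(rows, chunk_size):
    """Yield successive slices of rows, each at most chunk_size long."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


rows = ["XOXOXOXOXOXO9", "AOAOAOAOAOAO8", "BOBOBOBOBOBO7",
        "COCOCOCOCOCO6", "DODODODODODO5", "EOEOEOEOEOEO4",
        "FOFOFOFOFOFO3", "GOGOGOGOGOGO2", "HOHOHOHOHOHO1"]

gen = chunk_generator(rows, 4)

# Calling a list method on the generator fails: the generator is not a chunk.
try:
    gen.count("XOXOXOXOXOXO9")
except AttributeError as exc:
    error_message = str(exc)

# Iterating hands back the real chunks, which do have list methods.
chunk_lengths = [len(chunk) for chunk in gen]

print(error_message)
print(chunk_lengths)
```

The same reasoning applies to the pandas case: to_sql belongs to each DataFrame chunk, so it has to be called inside the loop.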

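So to fix your code, you just need to loop over the generator:

for chunk in chunck_generator(filename, header=False, chunk_size=10 ** 5):
    # 'append' keeps earlier chunks; 'replace' would recreate the table for every chunk
    yield chunk.to_sql('sample_table', engine, if_exists='append', schema='sample_schema', index=False)

But that seems convoluted. I would just do this: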

import pandas as pd
from sqlalchemy import create_engine

def sql_generator(engine, filename, header=False, chunk_size=10 ** 5):
    # With chunksize set, read_fwf returns an iterator of DataFrames
    # instead of one large DataFrame.
    frame = pd.read_fwf(
        filename,
        colspecs=[[0, 12], [12, 13]],
        index_col=False,
        header=None,
        iterator=True,
        chunksize=chunk_size
    )

    for chunk in frame:
        # 'append' keeps rows from earlier chunks; 'replace' would drop
        # and recreate the table on every chunk.
        yield chunk.to_sql(
            'sample_table',
            engine,
            if_exists='append',
            schema='sample_schema',
            index=False
        )


if __name__ == "__main__":
    filename = r'test_file.txt'
    engine = create_engine('postgresql://USER:PWD@IP:PORT/DB')
    for sql in sql_generator(engine, filename):
        print(sql)
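
A chunked load only keeps every row if each chunk is added to the table rather than replacing it. The sketch below tries that pattern without PostgreSQL or pandas. It is an illustration under assumptions: it swaps in the standard library's sqlite3 with an in-memory database, and load_in_chunks is a hypothetical helper, not part of any library.

```python
import sqlite3
from itertools import islice


def load_in_chunks(conn, rows, chunk_size):
    """Insert rows in batches, committing after each batch.

    Yields the number of rows written per batch so the caller can
    report progress.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sample_table (col1 TEXT, col2 TEXT)"
    )
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, chunk_size))
        if not batch:
            break
        # Each batch is appended; earlier batches stay in the table.
        conn.executemany("INSERT INTO sample_table VALUES (?, ?)", batch)
        conn.commit()
        yield len(batch)


conn = sqlite3.connect(":memory:")
rows = [("XOXOXOXOXOXO", "9"), ("AOAOAOAOAOAO", "8"), ("BOBOBOBOBOBO", "7")]

chunk_sizes = list(load_in_chunks(conn, rows, chunk_size=2))
total = conn.execute("SELECT COUNT(*) FROM sample_table").fetchone()[0]

print(chunk_sizes, total)
```

Because load_in_chunks is a generator, nothing is written until it is iterated, which is why the call is wrapped in list().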


Conclusion: the to_sql method is not efficient for loading large files, so I used the copy_from method from the psycopg2 package together with the chunksize option when creating the DataFrame. This loaded 9.8 million records (~17 GB) with 98 columns each in 30 minutes.

I have removed the original references to my actual file (I am using the sample file from the original post).

import pandas as pd
import psycopg2
import io

def sql_generator(cur, con, filename, boundries, col_names, header=False, chunk_size=2000000):
    # Read the fixed-width file lazily, chunk_size rows at a time.
    frame = pd.read_fwf(filename, colspecs=boundries, index_col=False, header=None,
                        iterator=True, chunksize=chunk_size, names=col_names)
    for chunk in frame:
        # Serialize the chunk to an in-memory, pipe-delimited buffer.
        # quoting=3 is csv.QUOTE_NONE.
        output = io.StringIO()
        chunk.to_csv(output, sep='|', quoting=3, escapechar='\\', index=False, header=False, encoding='utf-8')
        output.seek(0)
        # Bulk-load the buffer with COPY. Note: psycopg2 2.9+ quotes the table name,
        # so a schema-qualified name may need copy_expert() instead.
        cur.copy_from(output, 'sample_schema.sample_table', null="", sep="|")
        yield con.commit()  # commit() returns None, so each yielded value is None

if __name__ == "__main__":
    boundries = [[0, 12], [12, 13]]
    col_names = ['col1', 'col2']
    filename = r'test_file.txt'  # Refer to the sample file in the original post
    con = psycopg2.connect(database='database', user='username', password='pwd', host='ip', port='port')
    cur = con.cursor()
    for sql in sql_generator(cur, con, filename, boundries, col_names):
        print(sql)
    con.close()
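
The buffer-building step can be tried without a database or pandas. The sketch below is a standard-library approximation, not the code used for the real load: fixed_width_chunks is a hypothetical helper, and plain string slicing stands in for read_fwf. Each yielded buffer is the kind of file-like object that would be handed to copy_from.

```python
import io
from itertools import islice

# Same column boundaries as the sample file: 12 characters, then 1.
COLSPECS = [(0, 12), (12, 13)]


def fixed_width_chunks(lines, colspecs, chunk_size):
    """Yield one in-memory, pipe-delimited buffer per chunk of lines."""
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, chunk_size))
        if not batch:
            break
        buffer = io.StringIO()
        for line in batch:
            # Slice each fixed-width field out of the line.
            fields = [line[start:end].strip() for start, end in colspecs]
            buffer.write("|".join(fields) + "\n")
        buffer.seek(0)  # rewind so a consumer reads from the start
        yield buffer


sample = ["XOXOXOXOXOXO9", "AOAOAOAOAOAO8", "BOBOBOBOBOBO7"]
buffers = [buf.getvalue() for buf in fixed_width_chunks(sample, COLSPECS, 2)]

for text in buffers:
    print(repr(text))
```

Only one chunk's worth of text is held in memory at a time, which is the property that keeps RAM usage bounded for a very large input file.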

@desertnaut frankly, this question is 2 months old and I forgot about it a long time ago :). Sometimes I visit so many questions that I don't remember questions from the previous hour :) – furas 2 days ago


I suggested something like:

def _generator( engine, filename, ...):
    for chunk in pd.read_fwf(filename, ...):
        yield chunk.to_sql('sample_table', engine, ...)  # not sure about this, since row was not defined

for row in _generator(engine=engine, filename=filename):
    print(row)
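
One detail worth noting about consuming the generator: a for loop stops cleanly when the generator is exhausted, whereas a bare `while True: print(next(generator))` loop, as in the question, ends with a StopIteration exception. A minimal sketch with a hypothetical numbers generator:

```python
def numbers(limit):
    """Yield 0, 1, ..., limit - 1."""
    for value in range(limit):
        yield value


# A for loop handles exhaustion on its own.
collected = []
for value in numbers(3):
    collected.append(value)

# With next(), exhaustion surfaces as StopIteration and must be caught.
gen = numbers(3)
manual = []
stopped = False
while True:
    try:
        manual.append(next(gen))
    except StopIteration:
        stopped = True
        break

print(collected, manual, stopped)
```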
