StringIO and compatibility with 'with' statement (context manager)

I have some legacy code with a legacy function that takes a filename as an argument and processes the file contents. A working facsimile of the code is below.

I would like to avoid writing my generated content to disk just so that I can use this legacy function, so I thought I could use StringIO to create an object that stands in for the physical filename. However, this does not work, as you can see below.

I thought StringIO was the way to go with this. Can anyone tell me if there is a way to use this legacy function and pass it an argument that isn't a file on disk but that the legacy function can treat as one? Note that the legacy function passes the filename parameter to open() inside a with statement.

The one thing I came across on Google was http://bugs.python.org/issue1286, but that didn't help me.

Code

from pprint import pprint
import StringIO

# Legacy Function
def processFile(filename):
    with open(filename, 'r') as fh:
        return fh.readlines()

# This works
print 'This is the output of FileOnDisk.txt'
pprint(processFile('c:/temp/FileOnDisk.txt'))
print

# This fails
plink_data = StringIO.StringIO('StringIO data.')
print 'This is the error.'
pprint(processFile(plink_data))

Output

This is the output in FileOnDisk.txt:

['This file is on disk.\n']

This is the error:

Traceback (most recent call last):
  File "C:\temp\test.py", line 20, in <module>
    pprint(processFile(plink_data))
  File "C:\temp\test.py", line 6, in processFile
    with open(filename, 'r') as fh:
TypeError: coercing to Unicode: need string or buffer, instance found

A StringIO instance is already an open file. The open() function, on the other hand, takes only a filename and returns an open file. A StringIO instance is not suitable as a filename.

Also, you don't need to close a StringIO instance, so there is no need to use it as a context manager either.

If all your legacy code can take is a filename, then a StringIO instance is not the way to go. Use the tempfile module to generate a temporary filename instead.

Here is an example using a contextmanager to ensure the temp file is cleaned up afterwards:

import os
import tempfile
from contextlib import contextmanager

@contextmanager
def tempinput(data):
    temp = tempfile.NamedTemporaryFile(delete=False)
    temp.write(data)
    temp.close()
    try:
        yield temp.name
    finally:
        os.unlink(temp.name)

with tempinput('Some data.\nSome more data.') as tempfilename:
    processFile(tempfilename)
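The example above is written for Python 2. On Python 3, NamedTemporaryFile opens in binary mode by default, so writing a str raises a TypeError. The following is a minimal sketch of a text-mode variant, assuming Python 3 and UTF-8 content; the names temp_text_input and process_file are illustrative only, and process_file is a stand-in for the legacy function from the question.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_text_input(data, encoding="utf-8"):
    """Write text to a named temporary file and yield its path (hypothetical helper)."""
    # mode="w" opens the file in text mode; the default "w+b" rejects str on Python 3.
    temp = tempfile.NamedTemporaryFile(mode="w", encoding=encoding, delete=False)
    try:
        temp.write(data)
        temp.close()  # close first so the file can be reopened by name on every platform
        yield temp.name
    finally:
        temp.close()  # harmless if the file is already closed
        os.unlink(temp.name)


def process_file(filename):
    # Stand-in for the legacy function: it only accepts a filename.
    with open(filename, "r") as fh:
        return fh.readlines()


with temp_text_input("Some data.\nSome more data.") as path:
    lines = process_file(path)

print(lines)
```

Creating the file inside the try block means the cleanup also runs if the write itself fails.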

You can also switch to the newer Python 3 infrastructure offered by the io module (available in Python 2 and 3), where io.BytesIO is the more robust replacement for StringIO.StringIO / cStringIO.StringIO. This object does support being used as a context manager (but still can't be passed to open()).
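As a small illustration of that point, the sketch below (assuming Python 3; the sample strings are made up) uses io.BytesIO and io.StringIO in with statements and then shows that open() still refuses such an object.

```python
import io

# io.BytesIO holds bytes and can be used directly in a with statement.
with io.BytesIO(b"raw bytes\n") as raw:
    data = raw.read()

# io.StringIO is the text counterpart and behaves the same way.
with io.StringIO(u"first line\nsecond line\n") as buf:
    lines = buf.readlines()

# Leaving the with block closes the buffer.
closed_after = buf.closed

# open() expects a path (or file descriptor), not an in-memory file object.
try:
    open(buf)
    open_failed = False
except TypeError:
    open_failed = True

print(data, lines, closed_after, open_failed)
```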


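You could define your own open function that passes file-like objects through unchanged:

fopen = open
def open(fname, mode):
    # Return file-like objects as they are; open anything else normally.
    if hasattr(fname, "readlines"):
        return fname
    else:
        return fopen(fname, mode)

However, the with statement calls __enter__ on entry and __exit__ when the block finishes, and a Python 2 StringIO.StringIO instance has neither method.

You could define a custom class to use with this open:

class MyStringIO:
    def __init__(self, txt):
        self.text = txt
    def readlines(self):
        # True keeps the line endings, as a real file's readlines() does
        return self.text.splitlines(True)
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        pass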
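As the comments below point out, a replacement open only takes effect if the legacy function can see it, which means assigning it into the legacy module's namespace. The following is a rough sketch of that idea, assuming Python 3. Here legacy is a stand-in module built on the fly so the example is self-contained (in practice you would import the real module), and flexible_open is a hypothetical name.

```python
import io
import types

# Stand-in for the legacy module; normally this would be `import legacy`.
legacy = types.ModuleType("legacy")
exec(
    "def processFile(filename):\n"
    "    with open(filename, 'r') as fh:\n"
    "        return fh.readlines()\n",
    legacy.__dict__,
)


def flexible_open(fname, mode="r"):
    """Return file-like objects unchanged; open everything else normally."""
    if hasattr(fname, "readlines"):
        return fname
    return io.open(fname, mode)


# Shadow the builtin open inside the legacy module's namespace only.
legacy.open = flexible_open

# io.StringIO already supports the with statement, so no custom class is needed.
result = legacy.processFile(io.StringIO(u"StringIO data.\n"))
print(result)
```

Because the name is looked up in the module's globals before the builtins, only code inside the legacy module is affected; every other module keeps the normal open.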


This one is based on the contextmanager example in the Python documentation.

It simply wraps StringIO in a context manager: when the with block exits, execution resumes after the yield and the StringIO is closed properly. This avoids creating a temp file, but a large string will still consume memory, since StringIO buffers the whole string. It works well in most cases where you know the string data is not going to be long.

from contextlib import contextmanager

@contextmanager
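def buildStringIO(strData):
    from cStringIO import StringIO
    # Create the buffer before the try block so that the finally clause
    # never refers to an unbound name if construction fails.
    fi = StringIO(strData)
    try:
        yield fi
    finally:
        fi.close()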

Then you can do:

with buildStringIO('foobar') as f:
    print(f.read()) # will print 'foobar'

Comments
  • You can't open() a StringIO instance.
  • I'm using this solution. Here is a link to sample code that implements this directly: pastie.org/4450354 . Thank you to everyone that contributed here!
  • @mpettis: I've updated my answer to give an example using a context manager that will create the temporary file and clean it up for you in one go.
  • That is a really elegant way to handle this... Thanks!
  • @MartijnPieters: is there a reason you unlink the file after the yield statement instead of just closing it, or is it just for the sake of providing an example usage?
  • @mike: Because of the delete=False argument when it was created, the named temporary file will not be deleted as soon as it is closed — read the docs. Seems like that would have been fairly obvious from the temp.close() just before the yield temp.name statement...
  • Unfortunately that does not solve the problem since it would have to be inside of the legacy function
  • Wouldn't this open override it as long as it was in the same file?
  • @jdi I think it might work if it was defined before the legacy function, i.e. when the legacy module is imported.
  • Actually the only way to make the legacy module pick up the custom open is to define the new open first, then import the legacy module, and do: legacy.open = open. Because the legacy module is using its own scope.
  • I started to make another answer but quickly realized it was only half the problem, which your second example covers. You could suggest using tempfile.SpooledTemporaryFile with a max_size=10e8 or something high. This will be a file-like object, using StringIO under the hood, and already has a context manager.
  • This can be done with the standard library: "with closing(StringIO(.... data ....)) as f:"
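
The two standard-library suggestions in the last comments can be sketched as follows, assuming Python 3. The sample text and the max_size value are arbitrary. Note that both objects are still file-like objects rather than filenames, so neither can be handed to a function that calls open() on its argument.

```python
import tempfile
from contextlib import closing
from io import StringIO

# closing() turns any object with a close() method into a context manager.
with closing(StringIO(u"foobar")) as f:
    text = f.read()

# SpooledTemporaryFile keeps data in memory until max_size is exceeded,
# then rolls over to a real temporary file; it supports `with` directly.
with tempfile.SpooledTemporaryFile(max_size=10**6, mode="w+") as spooled:
    spooled.write("foobar")
    spooled.seek(0)
    spooled_text = spooled.read()

print(text, spooled_text)
```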