StringIO and compatibility with 'with' statement (context manager)
I have some legacy code with a legacy function that takes a filename as an argument and processes the file contents. A working facsimile of the code is below.
What I want to do is avoid writing generated content to disk just to use this legacy function, so I thought I could use StringIO to create an object in place of the physical filename. However, this does not work, as you can see below.
I thought StringIO was the way to go with this. Can anyone tell me if there is a way to use this legacy function and pass it something in the argument that isn't a file on disk but can be treated as such by the legacy function? The legacy function does have the with context manager doing work on the filename parameter value.
The one thing I came across in google was: http://bugs.python.org/issue1286, but that didn't help me...
from pprint import pprint
import StringIO

# Legacy Function
def processFile(filename):
    with open(filename, 'r') as fh:
        return fh.readlines()

# This works
print 'This is the output of FileOnDisk.txt'
pprint(processFile('c:/temp/FileOnDisk.txt'))
print

# This fails
plink_data = StringIO.StringIO('StringIO data.')
print 'This is the error.'
pprint(processFile(plink_data))

This is the output for FileOnDisk.txt:
['This file is on disk.\n']
This is the error:
Traceback (most recent call last):
  File "C:\temp\test.py", line 20, in <module>
    pprint(processFile(plink_data))
  File "C:\temp\test.py", line 6, in processFile
    with open(filename, 'r') as fh:
TypeError: coercing to Unicode: need string or buffer, instance found
A StringIO instance is an open file already. The open command, on the other hand, only takes filenames, to return an open file. A StringIO instance is not suitable as a filename.
Also, you don't need to close a StringIO instance, so there is no need to use it as a context manager either.
If all your legacy code can take is a filename, then a StringIO instance is not the way to go. Use the tempfile module to generate a temporary filename instead.
Here is an example using a contextmanager to ensure the temp file is cleaned up afterwards:
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def tempinput(data):
    temp = tempfile.NamedTemporaryFile(delete=False)
    temp.write(data)
    temp.close()
    try:
        yield temp.name
    finally:
        os.unlink(temp.name)

with tempinput('Some data.\nSome more data.') as tempfilename:
    processFile(tempfilename)
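The example above was written for Python 2, where a str can be written to a file opened in binary mode. As a minimal sketch of the same idea for Python 3, assuming the data is text, the temporary file can be opened with mode='w'. The processFile defined here is only a stand-in for the legacy function from the question.

```python
import os
import tempfile
from contextlib import contextmanager

def processFile(filename):
    # Stand-in for the legacy function from the question.
    with open(filename, 'r') as fh:
        return fh.readlines()

@contextmanager
def tempinput(data):
    # mode='w' so that str data can be written on Python 3;
    # delete=False keeps the file on disk after it is closed.
    temp = tempfile.NamedTemporaryFile(mode='w', delete=False)
    try:
        with temp:
            temp.write(data)  # leaving this block closes (and flushes) the file
        yield temp.name
    finally:
        os.unlink(temp.name)  # remove the file once the caller is done

with tempinput('Some data.\nSome more data.') as tempfilename:
    lines = processFile(tempfilename)
```

The file is closed before its name is handed out, so the legacy function can reopen it by name on platforms that do not allow a file to be opened twice.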
You can also switch to the newer Python 3 infrastructure offered by the io module (available in Python 2 and 3), where io.BytesIO is the more robust replacement for cStringIO.StringIO. This object does support being used as a context manager (but still can't be passed to open).
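A short sketch of that point, assuming Python 3: the io classes work in a with statement, but handing one to open where a filename is expected still raises an error. The variable names are illustrative only.

```python
import io

# io.StringIO holds text and can be used as a context manager.
with io.StringIO('StringIO data.\nMore data.\n') as fh:
    text_lines = fh.readlines()

# io.BytesIO is the equivalent for byte strings.
with io.BytesIO(b'raw bytes') as bh:
    payload = bh.read()

# Passing the object where a filename is expected still fails.
try:
    open(io.StringIO('not a path'))
except TypeError as exc:
    open_error = exc
```

Both objects are closed when their with block exits, so they cannot be read again afterwards.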
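You could define your own open function:

fopen = open

def open(fname, mode):
    # Return file-like objects unchanged; open anything else as a path.
    if hasattr(fname, "readlines"):
        return fname
    else:
        return fopen(fname, mode)

However, with calls __enter__ when the block starts and __exit__ when it is done, and Python 2's StringIO.StringIO does not define them.
You could define a custom class to use with this open:

class MyStringIO:
    def __init__(self, txt):
        self.text = txt

    def readlines(self):
        # Note: unlike file.readlines(), splitlines() drops the line endings.
        return self.text.splitlines()

    def __enter__(self):
        # 'with' binds the return value of __enter__ to the 'as' target.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Nothing to clean up; returning None lets exceptions propagate.
        pass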
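For the replacement open to take effect, the legacy function has to find it in its own module's namespace. The following is a rough sketch of that idea for Python 3, where io.StringIO already supports with. The legacy module is a hypothetical stand-in built in memory so the example runs on its own, and flexible_open is an illustrative name; with real code you would import the legacy module and assign to its open attribute.

```python
import builtins
import io
import types

# Hypothetical stand-in for the legacy module, built in memory so the
# example is self-contained; in practice this would be `import legacy`.
LEGACY_SOURCE = '''
def processFile(filename):
    with open(filename, 'r') as fh:
        return fh.readlines()
'''
legacy = types.ModuleType('legacy')
exec(LEGACY_SOURCE, legacy.__dict__)

def flexible_open(fname, mode='r'):
    # Hand back file-like objects untouched; open real paths as usual.
    if hasattr(fname, 'readlines'):
        return fname
    return builtins.open(fname, mode)

# Shadow the built-in only inside the legacy module's namespace.
legacy.open = flexible_open

patched_lines = legacy.processFile(io.StringIO('StringIO data.\n'))
```

Only name lookups inside the legacy module see the replacement; the built-in open is unchanged everywhere else.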
This one is based on the Python documentation for contextmanager.
It just wraps StringIO in a simple context manager: when the with block exits, execution resumes after the yield and the StringIO is closed properly. This avoids the need to create a temp file, but a large string will still use a lot of memory, since StringIO buffers the whole string. It works well in most cases where you know the string data is not going to be long.

from contextlib import contextmanager

@contextmanager
def buildStringIO(strData):
    from cStringIO import StringIO
    # Create the buffer before the try so that 'fi' always exists in finally.
    fi = StringIO(strData)
    try:
        yield fi
    finally:
        fi.close()
Then you can do:
with buildStringIO('foobar') as f:
    print(f.read())  # will print 'foobar'
- You can't "open" a StringIO instance.
- I'm using this solution. Here is a link to sample code that implements this directly: pastie.org/4450354 . Thank you to everyone that contributed here!
- @mpettis: I've updated my answer to give an example using a context manager that will create the temporary file and clean it up for you in one go.
- That is really an elegant way to handle this... Thanks!
- @MartijnPieters: is there a reason you would rather unlink the file after the yield statement, instead of just closing it, or is it just for the sake of providing an example usage?
- @mike: Because of the delete=False argument when it was created, the named temporary file will not be deleted as soon as it is closed; read the docs. Seems like that would have been fairly obvious from the temp.close() just before the try block.
- Unfortunately that does not solve the problem since it would have to be inside of the legacy function
- Wouldn't this open override it as long as it was in the same file?
- @jdi I think it might work if it was defined before the legacy function, i.e. when the legacy module is imported.
- Actually the only way to make the legacy module pick up the custom open is to define the new open first, then import the legacy module, and do: legacy.open = open. Because the legacy module is using its own scope.
- I started to make another answer but quickly realized it was only half the problem, which your second example covers. You could suggest using tempfile.SpooledTemporaryFile with a max_size=10e8 or something high. This will be a file-like object, using StringIO under the hood, and already has a context manager.
- This can be done with the standard library: "with closing(StringIO(.... data ....)) as f:"
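The last two comments can be sketched as follows, assuming Python 3. The max_size value and the variable names are illustrative. Note that in Python 3 io.StringIO already supports with on its own, so closing is mainly useful for objects that only provide close().

```python
import tempfile
from contextlib import closing
from io import StringIO

# SpooledTemporaryFile keeps data in memory until it grows past max_size,
# then transparently rolls over to a real temporary file on disk.
with tempfile.SpooledTemporaryFile(max_size=1024, mode='w+') as spooled:
    spooled.write('first line\nsecond line\n')
    spooled.seek(0)  # rewind before reading back what was written
    spooled_lines = spooled.readlines()

# closing() turns any object with a close() method into a context manager.
with closing(StringIO('foobar')) as f:
    closing_text = f.read()
```

Both objects are file-like rather than filenames, so they help only when the receiving code accepts an open file instead of calling open on its argument.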