How to optimize ASCII output with QTextStream


I'm currently writing out billions of binary records to ASCII files (ugh). I've got things working just fine, but I'd like to optimize the performance if I can. The problem is, the user is allowed to select any number of fields to output, so I can't know at compile-time which of 3-12 fields they'll include.

Is there a faster way to construct lines of ASCII text? As you can see, the types of the fields vary quite a bit and I can't think of a way around the series of if() statements. The output ASCII file has one line per record, so I've tried using a template QString constructed with arg, but that just slowed things down about 15%.

A faster solution doesn't have to use QTextStream, or necessarily write directly to the file, but the output is too large to write the whole thing to memory.

Here's some sample code:

QFile outfile(outpath);
if(!outfile.open(QIODevice::WriteOnly | QIODevice::Text | QIODevice::Truncate))
{
    qWarning("Could not open ASCII for writing!");
    return false;
} else
{
    /* compute XYZ precision */
    int prec[3] = {0, 0, 0}; //these non-zero values are determined programmatically

    /* set up the writer */
    QTextStream out(&outfile);
    out.setRealNumberNotation(QTextStream::FixedNotation);
    out.setRealNumberPrecision(3);
    QString del(config.delimiter); //the user chooses the delimiter character (comma, tab, etc) - using QChar is slower since it has to be promoted to QString anyway

    /* write the header line */
    out << "X" << del << "Y" << del << "Z";
    if(config.fields & INTFIELD)
        out << del << "IntegerField";
    if(config.fields & DBLFIELD)
        out << del << "DoubleField";
    if(config.fields & INTFIELD2)
        out << del << "IntegerField2";
    if(config.fields & TRIPLEFIELD)
        out << del << "Tri1" << del << "Tri2" << del << "Tri3";
    out << "\n";

    /* write out the points */
    for(quint64 ptnum = 0; ptnum < numpoints; ++ptnum)
    {
        pt = points.at(ptnum);
        out.setRealNumberPrecision(prec[0]);
        out << pt->getXYZ(0);
        out.setRealNumberPrecision(prec[1]);
        out << del << pt->getXYZ(1);
        out.setRealNumberPrecision(prec[2]);
        out << del << pt->getXYZ(2);
        out.setRealNumberPrecision(3);
        if(config.fields & INTFIELD)
            out << del << pt->getIntValue();
        if(config.fields & DBLFIELD)
            out << del << pt->getDoubleValue();
        if(config.fields & INTFIELD2)
            out << del << pt->getIntValue2();
        if(config.fields & TRIPLEFIELD)
        {
            out << del << pt->getTriple(0);
            out << del << pt->getTriple(1);
            out << del << pt->getTriple(2);
        }
        out << "\n";
    } //end for every point
    outfile.close();
}

(This doesn't answer the profiler question. It tries to answer the original question, which is the performance issue.)

I would suggest avoiding the use of QTextStream altogether in this case to see if that helps. The reason it might help with performance is that there's overhead involved, because text is encoded internally to UTF-16 for storage, and then decoded again to ASCII or UTF-8 when writing it out. You have two conversions there that you don't need.

Try using only the standard C++ std::ostringstream class instead. It's very similar to QTextStream and only minor changes are needed in your code. For example:

#include <sstream>

// ...

QFile outfile(outpath);
if (!outfile.open(QIODevice::WriteOnly | QIODevice::Text
                | QIODevice::Truncate))
{
    qWarning("Could not open ASCII for writing!");
    return false;
}

/* compute XYZ precision */
int prec[3] = {0, 0, 0};

std::ostringstream out;
out.precision(3);
std::fixed(out);
// I assume config.delimiter is a QChar.
char del = config.delimiter.toLatin1();

/* write the header line */
out << "X" << del << "Y" << del << "Z";
if(config.fields & INTFIELD)
    out << del << "IntegerField";
if(config.fields & DBLFIELD)
    out << del << "DoubleField";
if(config.fields & INTFIELD2)
    out << del << "IntegerField2";

if(config.fields & TRIPLEFIELD)
    out << del << "Tri1" << del << "Tri2" << del << "Tri3";
out << "\n";

/* write out the points */
for(quint64 ptnum = 0; ptnum < numpoints; ++ptnum)
{
    pt = points.at(ptnum);
    out.precision(prec[0]);
    out << pt->getXYZ(0);
    out.precision(prec[1]);
    out << del << pt->getXYZ(1);
    out.precision(prec[2]);
    out << del << pt->getXYZ(2);
    out.precision(3);
    if(config.fields & INTFIELD)
        out << del << pt->getIntValue();
    if(config.fields & DBLFIELD)
        out << del << pt->getDoubleValue();
    if(config.fields & INTFIELD2)
        out << del << pt->getIntValue2();
    if(config.fields & TRIPLEFIELD)
    {
        out << del << pt->getTriple(0);
        out << del << pt->getTriple(1);
        out << del << pt->getTriple(2);
    }
    out << "\n";

    // Write out the data and empty the stream.
    // Call str() once: every call returns a fresh copy of the buffer.
    const std::string line = out.str();
    outfile.write(line.data(), line.size());
    out.str("");
}
outfile.close();
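As a variation on the loop above, the stream does not have to be flushed after every record. The following is a minimal sketch in plain standard C++, not tested against Qt: a std::ostream stands in for the QFile, and writeBuffered, flushBytes and formatRecord are hypothetical names introduced here for illustration. Whether this actually beats QTextStream has to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: formats `count` records and hands the in-memory
// buffer to `sink` only once it has grown to at least `flushBytes`.
// `formatRecord(out, i)` stands in for the per-point formatting code.
// Returns the number of writes issued to `sink`.
template <typename FormatRecord>
std::size_t writeBuffered(std::ostream& sink, std::size_t count,
                          std::size_t flushBytes, FormatRecord formatRecord)
{
    assert(flushBytes > 0);
    std::ostringstream out;
    std::size_t flushes = 0;

    for (std::size_t i = 0; i < count; ++i) {
        formatRecord(out, i);
        out << '\n';

        // tellp() reports how many bytes are currently buffered.
        const std::streamoff used = out.tellp();
        if (static_cast<std::size_t>(used) >= flushBytes) {
            const std::string chunk = out.str();
            sink.write(chunk.data(),
                       static_cast<std::streamsize>(chunk.size()));
            out.str("");  // empty the buffer for the next chunk
            ++flushes;
        }
    }

    // Write whatever is left after the last full chunk.
    const std::string rest = out.str();
    if (!rest.empty()) {
        sink.write(rest.data(), static_cast<std::streamsize>(rest.size()));
        ++flushes;
    }
    return flushes;
}
```

With QFile, the two sink.write() calls would become outfile.write(chunk.data(), chunk.size()). Memory use stays bounded by roughly flushBytes plus one record.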


Given that you are writing out billions of records you might consider using the boost karma library:

http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/karma.html

According to their benchmark it runs much faster than C++ streams and even sprintf with most compilers/libraries, including Visual C++ 2010:

http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/karma/performance_measurements/numeric_performance/format_performance.html

It will take some learning, but you will be rewarded with significant speedup.


Use multiple cores (if available)! It seems to me that each point of your data is independent of the others. So you could split up the preprocessing using QtConcurrent::mappedReduced. e.g.:

  1. divide your data into a sequence of blocks consisting of N (e.g. 1000) points each,
  2. then let your mapFunction process each block into a memory buffer
  3. let the reduceFunction write the buffers to the file.

Use OrderedReduce | SequentialReduce as options.

This can be used in addition to the other optimizations!
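The three steps above can be sketched roughly as follows. To keep the example self-contained it uses std::async from the standard library instead of QtConcurrent::mappedReduced, so it illustrates the idea rather than the Qt API. formatBlock, writeParallel, blockSize and maxInFlight are hypothetical names, and a single double per record is a placeholder for the real fields.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <future>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// "Map" step: format points [begin, end) into one in-memory block.
std::string formatBlock(const std::vector<double>& points,
                        std::size_t begin, std::size_t end)
{
    std::ostringstream out;
    out.precision(3);
    out << std::fixed;
    for (std::size_t i = begin; i < end; ++i)
        out << points[i] << '\n';
    return out.str();
}

// Formats blocks on worker threads and appends them to `sink` in their
// original order (the ordered, sequential "reduce" step). At most
// `maxInFlight` blocks are held in memory at any time.
void writeParallel(std::ostream& sink, const std::vector<double>& points,
                   std::size_t blockSize, std::size_t maxInFlight)
{
    assert(blockSize > 0 && maxInFlight > 0);
    std::deque<std::future<std::string>> pending;

    for (std::size_t begin = 0; begin < points.size(); begin += blockSize) {
        const std::size_t end = std::min(begin + blockSize, points.size());
        pending.push_back(std::async(std::launch::async,
            [&points, begin, end] { return formatBlock(points, begin, end); }));

        // Too many blocks queued: wait for the oldest one and write it.
        if (pending.size() >= maxInFlight) {
            sink << pending.front().get();
            pending.pop_front();
        }
    }

    // Drain the remaining blocks, still oldest first.
    while (!pending.empty()) {
        sink << pending.front().get();
        pending.pop_front();
    }
}
```

Only the formatting runs in parallel. The writes stay on the calling thread and in block order, which is what OrderedReduce | SequentialReduce provides in the QtConcurrent version.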


If you don't have a proper profiler, but you do have a debugger that lets you break into the running application, manual profiling is an option:

  1. start the app in your debugger and trigger the slow code path,
  2. break the execution at a random moment while the slow part is running,
  3. look at the call stack and note which subroutine was active,
  4. repeat several times (about 10x or so).

The probability is high that you will find the same procedure on the stack in the majority of cases - that's the one you have to avoid or make faster in order to improve things.


Here I rewrote your piece of code using the standard C library - maybe that's faster. I didn't test it, so you may need to check the fprintf format specification in your compiler's documentation, since the supported format flags can differ.

Take care with the return types of your getter functions - if getTriple() does not return float or double, or the integer getters do not return int, you must change the matching %f and %d conversions in the format strings below.

#include <stdio.h>

// If outpath is a QString, pass QFile::encodeName(outpath).constData() instead.
FILE* out = fopen(outpath, "w");
if (out == NULL)
{
    qWarning("Could not open ASCII for writing!");
    return false;
} else {
    /* compute XYZ precision */
    int prec[3] = {0, 0, 0}; //these non-zero values are determined programmatically

    /* set up the writer */
    char del = config.delimiter; // if this is a QChar, use config.delimiter.toLatin1()

    /* write the header line - each optional field brings its own leading delimiter */
    fprintf(out, "X%cY%cZ", del, del);
    if(config.fields & INTFIELD)
        fprintf(out, "%cIntegerField", del);
    if(config.fields & DBLFIELD)
        fprintf(out, "%cDoubleField", del);
    if(config.fields & INTFIELD2)
        fprintf(out, "%cIntegerField2", del);
    if(config.fields & TRIPLEFIELD)
        fprintf(out, "%cTri1%cTri2%cTri3", del, del, del);
    fputc('\n', out);

    /* write out the points */
    for(quint64 ptnum = 0; ptnum < numpoints; ++ptnum)
    {
        pt = points.at(ptnum);
        // %.*f takes the precision from the int argument preceding the value
        fprintf(out, "%.*f%c%.*f%c%.*f", prec[0], pt->getXYZ(0), del, prec[1], pt->getXYZ(1), del, prec[2], pt->getXYZ(2));
        // %d assumes int and %.3f assumes float/double - adjust to the real return types
        if(config.fields & INTFIELD)
            fprintf(out, "%c%d", del, pt->getIntValue());
        if(config.fields & DBLFIELD)
            fprintf(out, "%c%.3f", del, pt->getDoubleValue());
        if(config.fields & INTFIELD2)
            fprintf(out, "%c%d", del, pt->getIntValue2());
        if(config.fields & TRIPLEFIELD)
            fprintf(out, "%c%.3f%c%.3f%c%.3f", del, pt->getTriple(0), del, pt->getTriple(1), del, pt->getTriple(2));
        fputc('\n', out);
    } //end for every point
    fclose(out);
}
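The per-field precision in the code above relies on the `*` precision specifier, which reads the number of decimals from an extra int argument at run time. A minimal standalone sketch of that mechanism, with formatFixed being a hypothetical helper name that is not part of the code above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats `value` in fixed notation with a number of
// decimals chosen at run time, using the "%.*f" conversion.
std::string formatFixed(double value, int precision)
{
    assert(precision >= 0);
    char buf[64];
    // The int argument before `value` supplies the precision for ".*".
    const int n = std::snprintf(buf, sizeof buf, "%.*f", precision, value);
    if (n < 0)
        return std::string();  // encoding error reported by snprintf

    if (static_cast<std::size_t>(n) < sizeof buf)
        return std::string(buf, static_cast<std::size_t>(n));

    // Output did not fit (very large values): retry with an exact-size buffer.
    std::string big(static_cast<std::size_t>(n) + 1, '\0');
    std::snprintf(&big[0], big.size(), "%.*f", precision, value);
    big.resize(static_cast<std::size_t>(n));
    return big;
}
```

For example, formatFixed(3.14159, 2) yields "3.14". The argument types must match the conversions exactly: the precision has to be an int, and a 64-bit integer field would need %lld with a long long argument rather than %d.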





Comments
  • You need to profile your app to find out what slows it down. Qt classes are already optimized and work fast if properly used. And your code is correct, I see nothing obviously slow. Profiling is really required in your case. Maybe your disk is slow, maybe it's QTextStream, maybe it's QString.
  • I have yet to find a good Qt profiler for 64-bit Windows. VerySleepy has potential, but its output is so arcane I hardly understand it. Suggestions?
  • It depends on which compiler you're using. I like gprof for gcc. For MSVC compiler the standard vsprofiler can be used.
  • This is a good suggestion. Unfortunately, my standard C++ stream usage is pretty rusty at this point. Can you provide a little guidance on the code, given my sample code?
  • @Phlucious It just occurred to me that you don't need to give up on QFile. You only need to replace QTextStream with std::ostringstream in order to avoid needless text encoding/decoding. I updated the answer.
  • I finally had time to give this a test. Unfortunately, every test I ran using some combination of the std streams and char* inevitably resulted in a slowdown of as much as 80%. (Yes, I made sure I was using \n instead of endl.) I ended up sticking with the original QTextStream implementation in the end.
  • @Phlucious Holy crap, two and a half years later :-D Anyway, your last bet is using fprintf(). That should be as fast as it gets without resorting to doing kernel calls in assembly...
  • I tend to shy away from the Boost library because I already have a strong dependency on Qt and I think it's useful to avoid numerous dependencies. Still, this looks interesting. Thanks!
  • I ended up doing this using chunks of around 100k points each. mapFunction writes into a memory buffer that the reduceFunction writes to disk. On 6 threads (4 cores) I got an approximately 50% speed boost.
  • @Phlucious: Great. The result seems plausible because the actual writing to file is not parallel and depends on the output media.