Read lines of characters and get file position

I'm reading sequential lines of characters from a text file. The encoding of the characters in the file might not be single-byte.

At certain points, I'd like to get the file position at which the next line starts, so that I can re-open the file later and return to that position quickly.

Questions

Is there an easy way to do both, preferably using standard Java libraries?

If not, what is a reasonable workaround?

Attributes of an ideal solution

  - An ideal solution would handle multiple character encodings. This includes UTF-8, in which different characters may be represented by different numbers of bytes.
  - An ideal solution would rely mostly on a trusted, well-supported library. Most ideal would be the standard Java library. Second best would be an Apache or Google library.
  - The solution must be scalable. Reading the entire file into memory is not a solution.
  - Returning to a position should not require reading all prior characters in linear time.

Details

For the first requirement, BufferedReader.readLine() is attractive. But buffering clearly interferes with getting a meaningful file position.

Less obviously, InputStreamReader also can read ahead, interfering with getting the file position. From the InputStreamReader documentation:

To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.

The method RandomAccessFile.readLine() reads a single byte per character.

Each byte is converted into a character by taking the byte's value for the lower eight bits of the character and setting the high eight bits of the character to zero. This method does not, therefore, support the full Unicode character set.


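If you construct a BufferedReader from an InputStreamReader wrapped around a FileInputStream, and keep the FileInputStream accessible to your code, you should be able to get a file position by calling:

fileInputStream.getChannel().position();

after a call to bufferedReader.readLine(). (FileReader itself has no getChannel() method; FileInputStream does.) Keep in mind that this reports how far the file has been read, so with buffering it can lie past the start of the next line.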

The BufferedReader could be constructed with an input buffer of size 1 if you're willing to trade performance for positional precision, although, as the question points out, the InputStreamReader underneath may still read ahead.
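Here is a minimal sketch of reading the channel position after a readLine() call, assuming a UTF-8 file. The class and method names are made up for illustration. It uses FileInputStream because that is the class that exposes getChannel().

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
class ChannelPositionDemo {
    /** Returns the channel position observed right after the first readLine(). */
    static long positionAfterFirstLine(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile());
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            reader.readLine();
            // Number of bytes pulled from the file so far, read-ahead included.
            return in.getChannel().position();
        }
    }
}
```

The returned value is never smaller than the byte length of the first line, but it can be larger, because the reader chain may already have consumed bytes that belong to later lines. Treat it as an upper bound rather than the exact start of the next line.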

Alternate Solution

What would be wrong with keeping track of the bytes yourself?

long startingPoint = 0; // or the saved position if this file has been previously processed
String line;

// readLine() returns null at end of file
while ((line = bufferedReader.readLine()) != null) {
    // charset is the Charset the file was opened with; the no-argument
    // getBytes() would use the platform default instead
    startingPoint += line.getBytes(charset).length;
}

This would give you a byte count that matches what you've already processed, regardless of underlying marking or buffering. You'd have to account for line endings in your tally, since readLine() strips them.
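One way to include the line endings in the tally is to read characters yourself instead of calling readLine(). The sketch below is only an illustration: LineOffsetTracker and lineStartOffsets are hypothetical names. It assumes the file has no byte order mark, that the charset does not add one when encoding (UTF-8 is fine, Java's plain "UTF-16" is not), that every line ends in '\n' (a preceding '\r' is simply counted as part of the line), and that the input is well-formed in that charset.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class LineOffsetTracker {
    /** Computes the byte offset at which each line starts. */
    static List<Long> lineStartOffsets(Reader reader, Charset charset) throws IOException {
        List<Long> starts = new ArrayList<>();
        StringBuilder pending = new StringBuilder(); // current line, terminator included
        long offset = 0;
        int c;
        while ((c = reader.read()) != -1) {
            if (pending.length() == 0) {
                starts.add(offset); // a new line begins at this byte offset
            }
            pending.append((char) c);
            if (c == '\n') {
                // Re-encode the whole line so multi-byte characters are counted correctly.
                offset += pending.toString().getBytes(charset).length;
                pending.setLength(0);
            }
        }
        return starts;
    }
}
```

Pass in a BufferedReader so the single-character reads stay cheap. A saved offset can later be handed to RandomAccessFile.seek() or FileChannel.position(long) to jump straight back to that line.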

The case seems to be solved by VTD-XML, a library able to quickly parse big XML files:

The latest Java release of VTD-XML (ximpleware), currently 2.13 (http://sourceforge.net/projects/vtd-xml/files/vtd-xml/), includes code that maintains a byte offset after each call to the getChar() method of its IReader implementations.

IReader implementations for various character encodings are available inside VTDGen.java and VTDGenHuge.java.

IReader implementations are provided for the following encodings:

  - ASCII
  - ISO_8859_1, ISO_8859_2, ISO_8859_3, ISO_8859_4, ISO_8859_5, ISO_8859_6, ISO_8859_7, ISO_8859_8, ISO_8859_9, ISO_8859_10, ISO_8859_11, ISO_8859_12, ISO_8859_13, ISO_8859_14, ISO_8859_15, ISO_8859_16
  - UTF8, UTF_16BE, UTF_16LE
  - WIN_1250, WIN_1251, WIN_1252, WIN_1253, WIN_1254, WIN_1255, WIN_1256, WIN_1257, WIN_1258

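I would suggest java.io.LineNumberReader. You can get and set the line number, so you can record the line index you stopped at. Note that setLineNumber() only changes the counter; it does not move the reader, so continuing at a given line still means reading and discarding the lines before it.

Since it is a BufferedReader, it can also handle UTF-8 when it wraps a reader created with that charset.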
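A rough sketch of resuming by line index with LineNumberReader follows. LineNumberResume and readLineAt are hypothetical names. Because it has to read through every earlier line, it does not meet the question's requirement of avoiding a linear scan.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;

// Hypothetical helper, not part of any library.
class LineNumberResume {
    /** Returns the line at the given zero-based index, or null if there is none. */
    static String readLineAt(Reader source, int lineIndex) throws IOException {
        try (LineNumberReader reader = new LineNumberReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // getLineNumber() has already been incremented for the line just read.
                if (reader.getLineNumber() - 1 == lineIndex) {
                    return line;
                }
            }
            return null; // fewer lines than requested
        }
    }
}
```

For a file, pass something like a Files.newBufferedReader(path, charset) result as the source so the charset is explicit.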

Solution A

  1. Use RandomAccessFile.readChar() or RandomAccessFile.readByte() in a loop.
  2. Check for your EOL characters, then process that line.

The problem with anything else is that you would have to absolutely make sure you never read past the EOL character.
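The two steps above could look roughly like this. It is a sketch under assumptions, not a tuned implementation: SeekableLineReader is a made-up name, and the code assumes an encoding in which the byte 0x0A only ever means a newline (true for UTF-8, ISO-8859-x and the Windows-125x code pages, not for UTF-16).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;

// Hypothetical helper, not part of any library.
class SeekableLineReader {
    /**
     * Reads one line starting at the current file pointer and decodes it.
     * Returns null at end of file. Afterwards, file.getFilePointer() is the
     * byte offset at which the next line starts.
     */
    static String readLine(RandomAccessFile file, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        boolean readAnything = false;
        int b;
        while ((b = file.read()) != -1) {
            readAnything = true;
            if (b == '\n') {
                break; // stop exactly at the end of the line, never past it
            }
            bytes.write(b);
        }
        if (!readAnything) {
            return null;
        }
        String line = new String(bytes.toByteArray(), charset);
        // Drop the '\r' of a "\r\n" terminator, as BufferedReader.readLine() would.
        return line.endsWith("\r") ? line.substring(0, line.length() - 1) : line;
    }
}
```

After each call, save file.getFilePointer(); later, file.seek(savedOffset) returns to that line without rereading anything before it. Reading one byte at a time from a RandomAccessFile is slow, so for large files you would probably want to read into your own byte buffer and track the offset manually.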

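readChar() returns a char, not a byte, but it always reads exactly two bytes (high byte first), so it only decodes correctly when the file is stored as big-endian UTF-16. For UTF-8 and other variable-width encodings, use readByte() and decode the bytes yourself.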

Reads a character from this file. This method reads two bytes from the file, starting at the current file pointer.

[...]

This method blocks until the two bytes are read, the end of the stream is detected, or an exception is thrown.

By using a RandomAccessFile and not a Reader you are giving up Java's ability to decode the charset in the file for you. A BufferedReader would do so automatically.

There are several ways of overcoming this. One is to detect the encoding yourself and then use the correct read*() method. The other way would be to use a BoundedInputStream.

There is one in this question: "Java: reading strings from a random access file with buffered input".

E.g. https://stackoverflow.com/a/4305478/16549
