How to read whole file in one string


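I want to read a JSON or XML file in PySpark. My file is split across multiple lines, so when I load it with

rdd = sc.textFile(json or xml)

each line of the file becomes a separate element of the RDD. The input begins with

"employees":

and the rest of it is spread across multiple lines.

Expected output: {"employees":[{"firstName":"John",......]}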

How to get the complete file in a single line using pyspark?

Please help me; I am new to Spark.

There are three ways to do this in PySpark. The first two are standard built-in Spark functions, and I came up with the third:

textFile, wholeTextFiles, and a "labeled" textFile (key = file path, value = one line from the file), which is a mix of the first two.

1.) textFile

input: rdd = sc.textFile('/home/folder_with_text_files/input_file')

output: array containing 1 line of file as each entry ie. [line1, line2, ...]

2.) wholeTextFiles

input: rdd = sc.wholeTextFiles('/home/folder_with_text_files/*')

output: array of tuples, where the first item is the "key" holding the file path and the second item contains that file's entire contents, i.e.

[(u'file:/home/folder_with_text_files/file1.txt', u'file1_contents'), (u'file:/home/folder_with_text_files/file2.txt', u'file2_contents'), ...]

3.) "Labeled" textFile


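import glob
from pyspark import SparkContext

sc = SparkContext("local", "example")  # if running locally
Data_File = "/home/folder_with_text_files"  # example folder; adjust to your data

# Build one RDD per file, keyed by filename, then union them all.
# Binding f as a default argument gives each lambda its own filename.
keyed = [sc.textFile(f).keyBy(lambda line, f=f: f)
         for f in glob.glob(Data_File + "/*")]
Spark_Full = sc.union(keyed)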

output: array of tuples, each with the filename as key and one line of the file as value. (Technically, with this method you can also use a key other than the actual file path, perhaps a hash of it, to save memory.) i.e.

[('/home/folder_with_text_files/file1.txt', 'file1_contents_line1'),
 ('/home/folder_with_text_files/file1.txt', 'file1_contents_line2'),
 ('/home/folder_with_text_files/file1.txt', 'file1_contents_line3'),
 ('/home/folder_with_text_files/file2.txt', 'file2_contents_line1'),
 ...]

You can also recombine either as a list of lines:

Spark_Full.groupByKey().map(lambda x: (x[0], list(x[1]))).collect()

[('/home/folder_with_text_files/file1.txt', ['file1_contents_line1', 'file1_contents_line2','file1_contents_line3']),
 ('/home/folder_with_text_files/file2.txt', ['file2_contents_line1'])]

Or recombine entire files back into single strings (in this example the result is similar to what you get from wholeTextFiles, but without the "file:" prefix on the file path, and with the lines joined by spaces rather than newlines):

Spark_Full.groupByKey().map(lambda x: (x[0], ' '.join(list(x[1])))).collect()
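If all you need is one string per file, a shorter route is wholeTextFiles followed by a small helper that collapses the line breaks. The following is a minimal sketch rather than part of the approach above: the names to_single_line and files_as_single_lines are made up for illustration, and sc is assumed to be an existing SparkContext.

```python
def to_single_line(text):
    """Collapse a multi-line string into one line, dropping blank lines."""
    stripped = (line.strip() for line in text.splitlines())
    return ' '.join(line for line in stripped if line)


def files_as_single_lines(sc, pattern):
    """Return an RDD of (filepath, single_line_contents) pairs.

    sc is assumed to be an existing SparkContext; pattern is a path or glob.
    """
    return sc.wholeTextFiles(pattern).mapValues(to_single_line)
```

It could be called as files_as_single_lines(sc, '/home/folder_with_text_files/*').collect(). Because wholeTextFiles keeps each file in one record, the line order inside a file is preserved without any grouping step.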


This is how you would do it in Scala:

val rdd = sc.wholeTextFiles("hdfs://nameservice1/user/me/test.txt")


If your data is not formed on one line as textFile expects, then use wholeTextFiles.

This will give you the whole file so that you can parse it down into whatever format you would like.
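As a rough illustration of that parsing step for JSON, the sketch below takes one (filepath, contents) pair of the kind wholeTextFiles produces and parses the contents with the standard json module. The function name parse_json_file, the sample path, and the sample data are all made up for the example; no Spark is needed to try it.

```python
import json


def parse_json_file(pair):
    """Turn a (filepath, file_contents) pair into (filepath, parsed_json)."""
    path, contents = pair
    return path, json.loads(contents)


# Made-up sample standing in for one element of sc.wholeTextFiles(...).
sample = ('file:/tmp/employees.json',
          '{\n  "employees": [\n    {"firstName": "John"}\n  ]\n}\n')

path, data = parse_json_file(sample)

# Re-serialize without whitespace to get the whole document on one line.
single_line = json.dumps(data, separators=(',', ':'))
print(single_line)  # prints {"employees":[{"firstName":"John"}]}
```

In Spark, the same function could presumably be applied with sc.wholeTextFiles(path).map(parse_json_file), assuming each file holds exactly one valid JSON document.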


"How to read whole [HDFS] file in one string [in Spark, to use as sql]":


// Put file to hdfs from edge-node's shell...

hdfs dfs -put <filename>

// Within spark-shell...

// 1. Load file as one string
val f = sc.wholeTextFiles("hdfs:///user/<username>/<filename>")
val hql = f.take(1)(0)._2

// 2. Use string as sql/hql
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
val results = hiveContext.sql(hql)
