I have a file with X lines in this format:
1::F::1::10::48067
I want to read as many lines as I pass to the following function:
case class User(userID: String, Gender: String, Occupation: String, Zipcode: String)
def getNthLineUser(lines: Array[String]) = {
  val rddUsersSplit = lines.foreach(e => println(e.split("::")(0), e.split("::")(1), e.split("::")(2), e.split("::")(3)))
}
getNthLineUser(RDDusers.take(2))
What I want this function to do is read each line, split it (for example "1::F::1::10::48067") on "::", and store the results in a sequence of User, so that I can later convert it to a DataFrame with .toDF.
I don't see how to convert the output, which is Unit, into a Seq or List. So far I have only managed to read each line, split it on "::", and print it to the screen.
How can I do it?
foreach is meant only for side effects, not for transforming data, which is why it returns Unit. In functional programming it should be avoided whenever you need a result. Use a method that transforms the collection instead, such as map:
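A minimal sketch of that approach, assuming the four fields of User come from positions 0 to 3 of each split line, as in the question's println (adjust the indices if your columns are laid out differently). The second sample line is made up for illustration, and the closing comment assumes a SparkSession named spark is in scope:

```scala
// Same case class as in the question.
case class User(userID: String, Gender: String, Occupation: String, Zipcode: String)

// map builds a new collection from each line, whereas foreach returns Unit.
def getNthLineUser(lines: Array[String]): Seq[User] =
  lines.map { line =>
    // Split once and reuse the result instead of splitting per field.
    val fields = line.split("::")
    // Assumption: positions 0 to 3, mirroring the question's println.
    User(fields(0), fields(1), fields(2), fields(3))
  }.toSeq

// Sample input; the second line is invented for this example.
val users = getNthLineUser(Array("1::F::1::10::48067", "2::M::56::16::70072"))

// In Spark, with `import spark.implicits._` in scope, users.toDF() gives a DataFrame.
```

In the question's setting, the call would stay getNthLineUser(RDDusers.take(2)), since take already returns an Array[String].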