From the course: Scala Essential Training for Data Science

Solution: Functions over DataFrames

(upbeat music) - [Instructor] Here is the solution to the challenge. Here's the command for starting Docker. We're going to use docker run, and we're going to pass in a volume mount parameter, -v, to map a host directory into the container. In my case, I stored the sales.csv file in my temp directory, and I'm going to map that to a container directory called data. I want to use an interactive session, so I'm passing in the -it parameter. I'm starting the Apache Spark container, and when that container is all loaded, the initial command to run is spark-shell. So we'll do that.

And now I'm going to import my implicits. And the command for loading the contents of sales.csv into a DataFrame is this. We're going to create a value called salesDF, which will be a DataFrame. We're going to use spark.read with the options where the header is true and where we will infer the schema, and we will use csv to load it. And the file that we're going to load is sales.csv from the directory called data. And once we…
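The commands typed into spark-shell are not visible in the transcript, so the following is only a sketch of what the loading step described above might look like. It assumes the container directory is mounted at the absolute path /data and that the options are set with individual option calls; the instructor's exact form may differ. The printSchema call is an added check, not something the transcript mentions.

```scala
// Run inside spark-shell, where `spark` (the SparkSession) is already defined.
import spark.implicits._

// Assumption: the host directory holding sales.csv is mounted at /data.
val salesDF = spark.read
  .option("header", "true")       // treat the first row as column names
  .option("inferSchema", "true")  // let Spark infer column types from the data
  .csv("/data/sales.csv")

// Added for illustration: inspect the inferred column names and types.
salesDF.printSchema()
```

With inferSchema enabled, Spark makes an extra pass over the file to work out column types; without it, every column is read as a string.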
