Scala notes: using a WordCount example to explain what map, flatMap and groupBy do and how they differ

Using Scala, count how many times each word occurs in a file

import scala.io.Source

object demo14 {
  def main(args: Array[String]): Unit = {
    // Open the file
    val source = Source.fromFile("Scala/data/words.txt")

    // Read every line into a List[String]; each line of the file is one element
    val list: List[String] = source.getLines().toList

    // Split each line on commas with flatMap, collecting all the words into one flat List[String]
    val wordList: List[String] = list.flatMap(line => {
      line.split(",")
    })

    // Group identical words: each distinct word becomes a key, and its value is the list of all its occurrences
    val wordsMap: Map[String, List[String]] = wordList.groupBy(w => w)

    // For each (word, occurrences) pair, the length of the list is the count of that word
    wordsMap.map({
      case (word: String, value: List[String]) => {
        word + "," + value.length
      }
    }).foreach(println)

    // Release the file handle
    source.close()
  }
}

The result is one line per distinct word, in the form word,count (the order of the lines is not guaranteed, because a Map is unordered)

Three important functions are used here: map, flatMap and groupBy
Let's briefly go over what each one does

map function

map(func) takes a function as its argument; call it func here. The parameter type of func must match the element type of the original collection (the collection that map is called on). The return type is up to func itself; in Scala, the value of the last expression in the function body is the return value

map applies func to every element of the original collection, and each application produces one new element. The new elements are collected into a new collection of the same kind and size as the original; the element type is whatever func returns, so it may differ from the original element type

Here is a simple example

val list: List[Int] = List(1, 2, 3, 4, 5)
val list_map: List[Int] = list.map(elem => {
  elem + 1
})

The original collection is a List and func returns an Int, so map returns a List[Int]
The elements of the original collection are Ints, so the parameter of the func passed to map is an Int as well. The anonymous function is written in shorthand here: elem is really (elem: Int), with the type inferred by the compiler
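To make the point about element types concrete, here is a minimal sketch. The object name MapExample, the value names and the "item-" prefix are made up for illustration and are not part of the WordCount program. It shows one call where func returns the same type as the input elements and one where it returns a different type.

```scala
object MapExample {
  // Hypothetical sample data
  val numbers: List[Int] = List(1, 2, 3, 4, 5)

  // func is Int => Int: the element type stays Int
  val incremented: List[Int] = numbers.map(n => n + 1)

  // func is Int => String: the element type changes, the length does not
  val labels: List[String] = numbers.map(n => "item-" + n)

  def main(args: Array[String]): Unit = {
    println(incremented) // List(2, 3, 4, 5, 6)
    println(labels)      // List(item-1, item-2, item-3, item-4, item-5)
  }
}
```

In both cases the result is still a List with five elements; only the element type depends on what func returns.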

flatMap function

flatMap(func) also takes a function func and, as with map, func is applied to every element of the original collection

The difference is that func returns a collection (rather than a single element) for each element. flatMap then "flattens" (merges, or concatenates) these collections into a single new collection and returns it. The result is the same kind of collection as the original, but its size may differ, and so may its element type.

Take the flatMap call from the WordCount program as an example

val wordList: List[String] = list.flatMap(line => {
  line.split(",")
})

The original collection is a List[String] in which each element is one line of text
The body of func here is line.split(","). line is an element of the original collection, a string such as "java,python,scala", so split() can be called on it, and split() returns an array. The actual return value of func is therefore an Array[String]

An Array can be treated as a collection too. flatMap takes the arrays produced by func for every line and flattens them into a single one-dimensional list
The result of the flatMap call is therefore a List[String]
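The difference from map is easiest to see by running both on the same input. The following is a small sketch; the object name FlatMapExample and the two sample lines are assumptions standing in for the contents of words.txt, which are not shown in this article.

```scala
object FlatMapExample {
  // Hypothetical sample lines standing in for the file contents
  val lines: List[String] = List("java,python,scala", "scala,java")

  // map keeps one result per line, so we get a list of arrays
  val nested: List[Array[String]] = lines.map(line => line.split(","))

  // flatMap concatenates the per-line arrays into one flat list of words
  val words: List[String] = lines.flatMap(line => line.split(","))

  def main(args: Array[String]): Unit = {
    // Arrays are converted to lists only to get a readable printout
    println(nested.map(_.toList)) // List(List(java, python, scala), List(scala, java))
    println(words)                // List(java, python, scala, scala, java)
  }
}
```

With map the result still has two elements, one per line; with flatMap it has five, one per word.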

groupBy function

groupBy groups the elements by a grouping key and returns a Map
Each key is one value of the grouping key; its type is the return type of the function passed to groupBy
Each value is a collection of the same kind as the original (a List here) holding all the elements of that group

The grouping key is usually an element of the original collection itself, or some part of that element
A group consists of all the elements of the original collection that produce the same key; those elements end up in the same List
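As an illustration of grouping by part of an element, the sketch below groups words by their first letter. The object name GroupByExample and the word list are hypothetical and chosen only to make the groups easy to see.

```scala
object GroupByExample {
  // Hypothetical sample words
  val words: List[String] = List("java", "javascript", "python", "scala", "sql")

  // The grouping key is the first character of each word, so the keys are Chars
  val byInitial: Map[Char, List[String]] = words.groupBy(w => w.head)

  def main(args: Array[String]): Unit = {
    // Sort by key before printing, because a Map has no guaranteed order
    byInitial.toList.sortBy(_._1).foreach(println)
  }
}
```

The key type is Char because that is what the grouping function returns, while every value is a List[String] containing the original, unmodified elements.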

Again, take the WordCount program as an example

val wordsMap: Map[String, List[String]] = wordList.groupBy(w => w)

w => w means that each element is its own grouping key
After flatMap, each element of wordList is a single word, such as "java"
Grouping by the word itself therefore yields a Map from each distinct word to the list of all its occurrences

Finally, call map on this Map: pattern matching splits each element into its key and value, the length of each value List is computed, and the key is output together with that length, which gives the count of each word
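The three steps can also be chained on an in-memory list, which makes it easy to try them without a file. This is only a sketch: the object name WordCountSketch and the sample lines are assumptions, and the counts are kept as a Map[String, Int] instead of being turned into strings straight away.

```scala
object WordCountSketch {
  // Hypothetical sample lines standing in for the file contents
  val lines: List[String] = List("java,python,scala", "scala,java", "scala")

  val counts: Map[String, Int] =
    lines
      .flatMap(line => line.split(","))  // List[String]: one element per word
      .groupBy(w => w)                   // Map[String, List[String]]
      .map { case (word, occurrences) => (word, occurrences.length) }

  def main(args: Array[String]): Unit = {
    // Sort by word so the output order is stable
    counts.toList.sortBy(_._1).foreach { case (word, n) => println(word + "," + n) }
    // java,2
    // python,1
    // scala,3
  }
}
```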

... That's all for now. Some parts above may not be clear enough. I wanted to go into more detail, but my articles have been copied and reposted wholesale, and reporting it did no good, so I lost the will to write more. Alas

Keywords: Scala map

Added by rross46 on Mon, 03 Jan 2022 06:41:20 +0200