Count characters in a piece of text as a one-liner using the Java 8 Stream API
A WordCount app is arguably the most commonly used example of the MapReduce principle (see the Hadoop or Spark tutorials). I wanted to do a similar thing in plain old Java as “a one-liner” (kind of). My objective was to explore the power and expressiveness of the Stream API and lambda expressions.
Here is what I wanted to achieve:
Input:
"Hello world!"

Result:
' ': 1
'!': 1
'd': 1
'e': 1
'h': 1
'l': 3
'o': 2
'r': 1
'w': 1
Essentially, the program takes the input text, converts it to lowercase, counts the occurrences of each distinct character, and prints the characters along with their counts, sorted by character.
Without further ado, here is how I went about solving the problem:
import java.util.List;
import java.util.stream.Collectors;

public class CharCounter {

    public List<Count> count(String text) {
        return text
            .toLowerCase()
            .chars()
            .distinct()
            .mapToObj(i -> new Count((char) i, text
                .toLowerCase()
                .chars()
                .filter(j -> j == i).count()))
            .sorted()
            .collect(Collectors.toList());
    }

    // Java (still!) lacks Tuples, so I created my own.
    static class Count implements Comparable<Count> {
        char key;   // a character
        long count; // the character count

        Count(char key, long count) {..}

        @Override
        public String toString() {..} // pretty print

        @Override
        public int compareTo(Count o) {
            return this.key - o.key;
        }
    }

    public static void main(String[] args) {
        new CharCounter()
            .count("Hello World!")
            .stream() // Yeah, it's a list and thus 'streamable'
            .forEach(System.out::println);
    }
}
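The constructor and toString() bodies are elided in the listing above. For anyone who wants to run it right away, here is a minimal sketch with those gaps filled in. The class name CharCounterSketch is hypothetical, the toString() format is only my guess based on the result shown earlier, and the version on GitHub may well look different. I also lowercase the text once and reuse it, which does not change the result.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, self-contained variant of CharCounter with the elided bodies filled in.
class CharCounterSketch {

    public static List<Count> count(String text) {
        String lower = text.toLowerCase(); // lowercase once and reuse it in both streams
        return lower.chars()
                .distinct()
                // for each distinct character, rescan the text and count its occurrences
                .mapToObj(i -> new Count((char) i,
                        lower.chars().filter(j -> j == i).count()))
                .sorted()
                .collect(Collectors.toList());
    }

    static class Count implements Comparable<Count> {
        final char key;   // a character
        final long count; // the character count

        Count(char key, long count) {
            this.key = key;
            this.count = count;
        }

        // Assumed format, e.g. 'l': 3
        @Override
        public String toString() {
            return "'" + key + "': " + count;
        }

        @Override
        public int compareTo(Count o) {
            return Character.compare(key, o.key);
        }
    }

    public static void main(String[] args) {
        count("Hello world!").forEach(System.out::println);
    }
}
```

Note that toString() and compareTo() have to be public here, because they override public methods and Java does not allow an override to reduce visibility.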
Hope it makes sense. You might argue that I could have used something like a Map.Entry instead of a custom Count object. However, I find it much cleaner this way. The complete example is available on GitHub.
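For comparison, the Map.Entry route might look roughly like the sketch below. This is not the code from the repository; CharCounterMap and its count method are hypothetical names. It uses Collectors.groupingBy with a TreeMap, so the keys come out sorted and the text is traversed only once instead of being rescanned for every distinct character.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical alternative: count characters into a sorted map in a single pass.
class CharCounterMap {

    public static Map<Character, Long> count(String text) {
        return text.toLowerCase()
                .chars()
                .mapToObj(c -> (char) c) // IntStream -> Stream<Character>
                .collect(Collectors.groupingBy(
                        Function.identity(),     // group by the character itself
                        TreeMap::new,            // TreeMap keeps the keys sorted
                        Collectors.counting())); // count the members of each group
    }

    public static void main(String[] args) {
        // Each Map.Entry plays the role of the custom Count object.
        for (Map.Entry<Character, Long> e : count("Hello world!").entrySet()) {
            System.out.println("'" + e.getKey() + "': " + e.getValue());
        }
    }
}
```

The trade-off is the one mentioned above: no extra class is needed, but the call site has to deal with getKey() and getValue() rather than a type with descriptive field names.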
Thanks for reading and definitely let me know your thoughts on how this could be done differently.