Streams
Introduction
Collections, lists or sets are part of the everyday work of a Java programmer. There is no software that is implemented without one of these data structures. The classic way to deal with these data structures are loops.
Unfortunately, loops have the disadvantage that they make the code a little confusing. The programmer has to analyze the source code line by line to understand what has been implemented.
The Streams API was introduced with Java 8. Streams allow the programmer to write source code that is readable and thus also more maintainable.
A stream is a sequence of objects on which certain methods can be executed. We will get to know which method this is exactly below.
Characteristics of Streams
- Streams are not related to InputStreams, OutputStreams, etc.
- Streams are NOT data structures but are wrappers around Collection that carry values from a source through a pipeline of operations.
- Streams are more powerful, faster and more memory efficient than Lists
- Streams are designed for lambdas
- Streams can easily be output as arrays or lists
- Streams employ lazy evaluation
- Streams are parallelizable
- Streams can be “on-the-fly”
Creating Streams
a stream can be created by using one of the following ways:
- From individual values
Stream.of(val1, val2, …)
- From array
Stream.of(someArray)
Arrays.stream(someArray)
- From List (and other Collections)
someList.stream()
someOtherCollection.stream()
ParallelStreams
In addition to the classic streams, Java 8 also offers the parallel streams. As the name suggests, ParallelStreams are streams that are processed in parallel. This has the great advantage that you don't have to worry about multithreading yourself. The Stream API does this for us.
The following source code shows the use of Streams and ParallelStreams.
List list = Arrays.asList("Customer 1", "Customer 2", "Customer 3", "Customer 4", "Customer 5");
list.stream().forEach(s -> System.out.println("Stream: " + s + " - " + Thread.currentThread()));
list.parallelStream().forEach(s -> System.out.println("ParallelStream: " + s + " - " + Thread.currentThread()));
In the following comes another example to show how to use ParallelStreams for more performance.
private static void timingTest(Stream<Employee> testStream) {
long startTime = System.nanoTime();
testStream.forEach(e -> doSlowOp());
long endTime = System.nanoTime();
System.out.printf(" %.3f seconds.%n",
deltaSeconds(startTime, endTime));
}
private static double deltaSeconds(long startTime, long endTime) {
return((endTime - startTime) / 1000000000);
}
void doSlowOp() {
try {
TimeUnit.SECONDS.sleep(1);
} catch (InterruptedException ie) {
// Nothing to do here.
}
}
void main() {
System.out.print("Serial version [11 entries]:");
timingTest(googlers());
int numProcessorsOrCores =
Runtime.getRuntime().availableProcessors();
System.out.printf("Parallel version on %s-core machine:",
numProcessorsOrCores);
timingTest(googlers().parallel() );
}
The results of the doSlowOp
functionality for the Serial version (11 entries) is about 11.000 seconds; and for the parallel version on 4-core machine is about 3.000 seconds.
Common Functional Interfaces
Predicate<T>
- Represents a predicate (boolean-valued function) of one argument
- Functional method is boolean Test(T t)
- Evaluates this Predicate on the given input argument (T t)
- Returns true if the input argument matches the predicate, otherwise false
Supplier<T>
- Represents a supplier of results
- Functional method is T get()
- Returns a result of type T
Function<T,R>
- Represents a function that accepts one argument and produces a result
- Functional method is R apply(T t)
- Applies this function to the given argument (T t)
- Returns the function result
Consumer<T>
- Represents an operation that accepts a single input and returns no result
- Functional method is void accept(T t)
- Performs this operation on the given argument (T t)
UnaryOperator<T>
- Represents an operation on a single operands that produces a result of the same type as its operand
- Functional method is R Function.apply(T t)
- Applies this function to the given argument (T t)
- Returns the function result
BiFunction<T,U,R>
- Represents an operation that accepts two arguments and produces a result
- Functional method is R apply(T t, U u)
- Applies this function to the given arguments (T t, U u)
- Returns the function result
BinaryOperator<T>
- Extends BiFunction
- Represents an operation upon two operands of the same type, producing a result of the same type as the operands
- Functional method is R BiFunction.apply(T t, U u)
- Applies this function to the given arguments (T t, U u) where R,T and U are of the same type
- Returns the function result
Comparator<T>
- Compares its two arguments for order.
- Functional method is int compareTo(T o1, T o2)
- Returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second.
Stream Pipeline
A Stream is processed through a pipeline of operations and starts with a source data structure.
Intermediate methods are performed on the Stream elements. These methods produce Streams and are not processed until the terminal method is called.
The Stream is considered consumed when a terminal operation is invoked. No other operation can be performed on the Stream elements afterwards.
A Stream pipeline contains some short-circuit methods (which could be intermediate or terminal methods) that cause the earlier intermediate methods to be processed only until the short-circuit method can be evaluated.
The most important common methods of streams are:
- Intermediate Methods
map
,filter
,distinct
,sorted
,peek
,limit
,parallel
- Terminal Methods
forEach
,toArray
,reduce
,collect
,min
,max
,count
,anyMatch
,allMatch
,noneMatch
,findFirst
,findAny
,iterator
- Short-circuit Methods
anyMatch
,allMatch
,noneMatch
,findFirst
,findAny,limit
Optional<T> Class
A container which may or may not contain a non-null value. Its common methods are: * isPresent() – returns true if value is present * Get() – returns value if present * orElse(T other) – returns value if present, or other * ifPresent(Consumer) – runs the lambda if value is present
Common Stream API Methods
Void forEach(Consumer)
It is an easy way to loop over Stream elements. You supply a lambda for forEach and that lambda is called on each element of the Stream. Related peek method does the exact same thing, but returns the original Stream.
The same code using the loop:
List<Employee> employees = getEmployees();
for(Employee e: employees) {
e.setSalary(e.getSalary() * 11/10);
}
Stream<T> map(Function)
map
produces a new Stream that is the result of applying a Function to each element of original Stream
For example:
Stream<T> filter(Predicate)
filter
produces a new Stream that contains only the elements of the original Stream that pass a given test
For example:
Optional<T> findFirst()
findFirst
returns an Optional for the first entry in the Stream.
For example:
employees.filter(…).findFirst().orElse(Consultant)
// Get the first Employee entry that passes the filter
Object[] toArray(Supplier)
toArray
reads the Stream of elements into a an array.
For example:
Employee[] empArray = employees.toArray(Employee[]::new);
// Create an array of Employees out of the Stream of Employees
List<T> collect(Collectors.toList())
collect
reads the Stream of elements into a List or any other collection.
For example:
List<Employee> empList =
employees.collect(Collectors.toList());
// Create a List of Employees out of the Stream of Employees
collect partitioningBy
You provide a Predicate. It builds a Map where true maps to a List of entries that passed the Predicate, and false maps to a List that failed the Predicate. For example:
Map<Boolean,List<Employee>> richTable =
googlers().collect(partitioningBy(e -> e.getSalary() > 1000000));
collect groupingBy
You provide a Function. It builds a Map where each output value of the Function maps to a List of entries that gave that value. For example:
Map<Department,List<Employee>> deptTable =
employeeStream().collect(groupingBy(Employee::getDepartment));
T reduce(T identity, BinaryOperator)
You start with a seed (identity) value, then combine this value with the first Entry in the Stream, combine the second entry of the Stream, etc. For example:
IntStream (Stream on primative int) has build-insum()
, Min
and Max
methods.
Stream<T> limit(long maxSize)
Limit(n)
returns a stream of the first n elements.
For example:
Stream<T> skip(long n)
skip(n)
returns a stream starting with element n.
For example:
Stream<T> sorted(Comparator)
Returns a stream consisting of the elements of this stream, sorted according to the provided Comparator For example:
empStream.map(…).filter(…).limit(…).sorted((e1, e2) -> e1.getSalary() - e2.getSalary())
// Employees sorted by salary
Optional<T> min(Comparator)
Returns the minimum element in this Stream according to the Comparator For example:
Employee alphabeticallyFirst =
ids.stream().map(EmployeeSamples::findGoogler)
.min((e1, e2) -> e1.getLastName().
compareTo(e2.getLastName())).get();
// Get Googler with earliest lastName
Optional<T> max(Comparator)
Returns the minimum element in this Stream according to the Comparator. For example:
Employee richest =
ids.stream().map(EmployeeSamples::findGoogler)
.max((e1, e2) -> e1.getSalary() - e2.getSalary()).get();
// Get Richest Employee
Stream<T> distinct()
Returns a stream consisting of the distinct elements of this stream. For example:
List<Integer> ids2 =
Arrays.asList(9, 10, 9, 10, 9, 10);
List<Employee> emps4 =
ids2.stream().map(EmployeeSamples::findGoogler)
.distinct().collect(toList());
// Get a list of distinct Employees
long count()
Returns the count of elements in the Stream. For example:
Boolean anyMatch(Predicate)
anyMatch
returns true if Stream passes, false otherwise. It processes elements in the Stream one element at a time until it finds a match according to the Predicate and returns true if it found a match.
Boolean allMatch(Predicate)
allMatch
returns true if Stream passes, false otherwise. It processes elements in the Stream one element at a time until it fails a match according to the Predicate and returns false if an element failed the Predicate.
Boolean noneMatch(Predicate)
noneMatch
returns true if Stream passes, false otherwise. It processes elements in the Stream one element at a time until it finds a match according to the Predicate and returns false if an element matches the Predicate.
For example:
employeeStream.anyMatch(e -> e.getSalary() > 500000)
// Is there a rich Employee among all Employees?
Filter-Map-Reduce has some analogies to Google's MapReduce algorithm, but is not the same (See also Wikipedia).
Filter-Map-Reduce describes a pattern in which a quantity of data is processed in a sequence of specific steps:
* Filter: filtering of desired elements from a set of elements. The element types remain the same, but the number of elements is reduced. Example: Stream.filter (Predicate)
.
* Map: transformation of the elements. The number of elements does not change, but the element type often changes. For example, calculations, extractions or conversions can be carried out. Example: Stream.map (Function)
.
* Reduce: reduction to an end result. Examples: Stream.forEach (Consumer)
, Stream.collect (Collector)
and Stream.reduce (BinaryOperator)
.
Examples for Filter-Map-Reduce (display the result with System.out.println(...)
):
Initialization
class book {
String title; String author; int year; double price;
public book (string title, string author, int year, double price) {
this.titel = title; this.autor = author;
this.year = year; this.price = price;
}
}
List <Book> myBookList = Arrays.asList (
new book ("Fortran", "Ferdinand", 1957, 57.99),
new book ("Java in 3 Tage", "Anton", 2005, 11.99),
new book ("Java in 4 Tage", "Berta", 2005, 22.99),
new book ("Filter-Map-Reduce with Lambdas", "Caesar", 2014, 33.99));
- Filter certain book objects, e.g. all from 2004,
- folder or extract the desired information, e.g. the author name,
- Reduce to a result string, e.g. Comma-separated names.
String s = myBookList.stream ().filter (b -> b.year> = 2004)
.map (b -> b.autor).reduce ("", (s1, s2) -> (s1.isEmpty ())? s2: s1 + "," + s2);
Grouping, e.g. by year, as well as counting per group.
Map <Integer, Long> m = myBookList.stream ().filter (b -> b.year> = 2004)
.collect (Collectors.groupingBy (b -> Integer.valueOf (b.jahr),Collectors.counting ()));
Reduce with the aggregate function, e.g. on the average.
OptionalDouble d = myBookList.stream ()
.filter (b -> b.year> = 2000).mapToDouble (b -> b.price).average ();
Stream generation with Files.lines (), mapping to number of characters per line and reduction to a total value.
long number of characters = Files.lines (Paths.get ("MyTextFile.txt"),StandardCharsets.UTF_8)
.mapToInt (String :: length).sum ();
Stream generation with Files.lines()
, generation of "sub-streams" per line with Pattern.splitAsStream()
and merging of the streams again with Stream.flatMap()
, as well as reduction to the number of elements with Stream.count()
.
long numberWords = Files.lines (Paths.get ("MyTextFile.txt"), StandardCharsets.UTF_8)
.flatMap (Pattern.compile ("[^\\p{L}]") :: splitAsStream).count ();
Exercise
- Exercise:
- Write a method that returns the average of a list of integers.
- Exercise:
- Write a method that converts all strings in a list to their upper case.
- Exercise:
- Given a list of strings, write a method that returns a list of all strings that start with the letter 'a' (lower case) and have exactly 3 letters.
- Exercise:
- Write a method that returns a comma-separated string based on a given list of integers. Each element should be preceded by the letter 'e' if the number is even, and preceded by the letter 'o' if the number is odd. For example, if the input list is (3,44), the output should be 'o3,e44'.
Solution 1:
public Double average(List<Integer> list) {
return list.stream()
.mapToInt(i -> i)
.average()
.getAsDouble();
}
Solution 2:
public List<String> upperCase(List<String> list) {
return list.stream()
.map(String::toUpperCase)
.collect(Collectors.toList());
}
public List<String> search(List<String> list) {
return list.stream()
.filter(s -> s.startsWith("a"))
.filter(s -> s.length() == 3)
.collect(Collectors.toList());
}
Solution 4: