
Easy Hacks: How to Use Java Streams for Working With Data

In March 2014, the Ice Bucket Challenge helped the internet come together to fight ALS. Disney’s Frozen taught a generation of children to Let It Go, and the community saw the release of Java 8 LTS to much fanfare. In fact, as per the JetBrains Developer Ecosystem survey, Java 8 still holds the top spot at 50% usage over all other Java versions. One of the major features introduced in Java 8 was the Streams API.

As we dive deeper into the topic, you’ll learn what Java Streams are and how to work with them. You’ll also get some practical examples of commonly used methods.

Note: For this post, you’ll need Java 16 or higher so that we can use some of the newer language features with Streams. Streams haven’t changed much since the Java 8 release, but the feature has been updated to include a few quality-of-life improvements.

What are Java Streams?

Data processing is an essential part of any programming language. While Java has always had control flow structures like if, else, switch, and loops, these approaches are imperative. Imperative code is explicit about how an application executes steps and in what order they should be done. The imperative code style can also end up being more verbose and mask the developer’s intent. Streams aim to solve some of these issues with a new approach to data processing that opens up opportunities for parallel processing.

According to Oracle’s documentation, a Stream<> is a sequence of elements supporting sequential and parallel aggregate operations. In other words, when working with a data set, you can apply set-based methods to transform the collection into a new final result. Streams are a declarative approach to data processing, focusing on clear intent communicated through a chain of method calls.

Let’s look at a typical example of finding all the even values in a List<Integer>.

var numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
var evenNumbers = new ArrayList<Integer>();

for (Integer number : numbers) {
    if (number % 2 == 0) {
        evenNumbers.add(number);
    }
}

The code sample uses control flow structures with for and if statements. There’s nothing wrong with this approach, but we can do better with streams. Let’s see what that looks like.

var numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
var evenNumbers = numbers.stream()
        .filter(number -> number % 2 == 0)
        .toList();

Wow! We’ve reduced the implementation from six lines to just one (ignoring formatting). In this example, we use the filter method to apply a predicate condition (a statement that results in a boolean) to determine when an item in the collection matches our evenness criteria. 

Another element you don’t see in a stream statement is any looping operation. The iteration is an implementation detail; thus, steps in the stream may be processed sequentially or in parallel if you invoke the parallel or parallelStream methods. Since this post is aimed at beginners, we won’t dive into parallelization, but be aware that these options exist.

If we’re honest with ourselves, we typically just want the correct result, no matter how we get it. You can think of a stream statement as setting up a pipeline, where each step represents an individual task to be performed. That’s why they’re called streams.
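
To make the pipeline idea concrete, here is a minimal sketch with three steps. The class and method names (PipelineSketch, squaresOfEvens) are made up for illustration and are not part of the original examples.

```java
import java.util.List;

class PipelineSketch {
    // Hypothetical helper: each chained call is one step in the pipeline.
    static List<Integer> squaresOfEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0) // step 1: keep only even values
                .map(n -> n * n)         // step 2: square each remaining value
                .toList();               // step 3: run the pipeline and collect the results
    }

    public static void main(String[] args) {
        System.out.println(squaresOfEvens(List.of(1, 2, 3, 4, 5, 6))); // prints [4, 16, 36]
    }
}
```

Reading the chain from top to bottom tells you what happens to the data without spelling out how the iteration is performed.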

Now that you understand what streams are, let’s review some sample exercises to see how you may use stream statements to simplify everyday data processing tasks. But first, let’s take a quick detour and look at the Optional type, as it’s a critical part of the Stream API.

Java’s Optional type

The declarative style of streams can create uncertainty about results. Let’s look at an example of a stream that produces no result, and everything will become clear.

var names = Arrays.asList("Maarten", "Marit", "Mala");
var khalid = names.stream()
        .filter(n -> n.equals("Khalid"))
        .findFirst();

While the stream statement is valid and will execute, we’re not guaranteed a result from it. Where an operation would otherwise have to return either a single object or null, it returns an Optional<> instead. What is this type, and why is it important?

The Oracle documentation describes Optional<> as a container object that may or may not contain a non-null value. Helper methods such as isPresent and isEmpty help determine if and when a value exists, with both returning true or false for their respective question. You can also use the get method to return an instance’s non-null value, but you risk triggering a NoSuchElementException if the preceding checks are not used. You may also use the orElse method as a more straightforward approach to retrieving the non-null value or a default instance. Let’s see these methods in action.

Optional<String> name = Optional.empty();

name.isEmpty(); // true
name.isPresent(); // false
name.orElse("Khalid"); // "Khalid"
name.get(); // 💥 Boom! Throws NoSuchElementException

The Optional type allows Java to execute your declarative stream statements, knowing it can give you a result, even though there could be no result at all, which brings us to the lesson of this section. In imperatively written code, we use if statements to be defensive and avoid app-breaking exceptions. With streams, you can use the Optional type to be as defensive as before but with a declarative approach and a fabulous set of helper methods. Bugs are nasty, and you can never be too careful.
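
As a rough sketch of that defensive style, the example below chains map, orElse, and ifPresent on the Optional returned by findFirst. The OptionalSketch class, its method names, and the message strings are assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;

class OptionalSketch {
    // Hypothetical helper: findFirst may or may not find a match.
    static Optional<String> findName(List<String> names, String wanted) {
        return names.stream()
                .filter(n -> n.equals(wanted))
                .findFirst();
    }

    // map only runs when a value is present; orElse supplies the fallback.
    static String greeting(List<String> names, String wanted) {
        return findName(names, wanted)
                .map(n -> "Hello, " + n + "!")
                .orElse("Nobody here by that name");
    }

    public static void main(String[] args) {
        var names = List.of("Maarten", "Marit", "Mala");
        System.out.println(greeting(names, "Marit"));  // prints Hello, Marit!
        System.out.println(greeting(names, "Khalid")); // prints Nobody here by that name
        // ifPresent runs the lambda only when a value exists.
        findName(names, "Mala").ifPresent(n -> System.out.println("Found " + n));
    }
}
```

Neither path needs an explicit if statement or a call to get, so there is no opportunity for a NoSuchElementException.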

Let’s get to the samples now.

Simple examples of Java Streams

To get started, we’ll need a data model for our streams. Let’s keep it simple with a Fruit collection that contains a name, category, and price. The starting code will give us a few ways to query and transform the data set using streams.

static List<Fruit> fruits = Arrays.asList(
        new Fruit("Avocado", FruitCategory.NonSweet, 10.0),
        new Fruit("Cucumbers", FruitCategory.NonSweet, 1.0),
        new Fruit("Apple", FruitCategory.SubAcid, 2.0),
        new Fruit("Mango", FruitCategory.SubAcid, 5.0),
        new Fruit("Peach", FruitCategory.SubAcid, 7.0),
        new Fruit("Banana", FruitCategory.Sweet, 2.0),
        new Fruit("Fig", FruitCategory.Sweet, 8.0),
        new Fruit("Papaya", FruitCategory.Sweet, 12.0),
        new Fruit("Grapefruit", FruitCategory.Acid, 7.0),
        new Fruit("Kiwi", FruitCategory.Acid, 12.0),
        new Fruit("Pineapple", FruitCategory.Acid, 14.0),
        new Fruit("Tomato", FruitCategory.Nightshades, 1.0),
        new Fruit("Eggplant", FruitCategory.Nightshades, 2.0),
        new Fruit("Habanero", FruitCategory.Nightshades, 1.0),
        new Fruit("Jalapeno", FruitCategory.Nightshades, 1.0)
);

record Fruit(String name, FruitCategory category, double price) {
}

enum FruitCategory {
    NonSweet,
    SubAcid,
    Sweet,
    Acid,
    Nightshades
}

Let’s start with straightforward mapping queries. A map is a step in a pipeline that may transform our initial data into something entirely different. Let’s look at a few map examples based on our Fruit collection.

Our first statement is to get only the names of our fruits from the existing collection. We use a method reference to Fruit::name to reduce the complexity of our statement, which is logically equivalent to the lambda expression of fruit -> fruit.name().

var onlyNames = fruits.stream()
        .map(Fruit::name)
        .toList();

Now, let’s retrieve only the prices of our fruits.

var onlyPrices = fruits.stream()
        .map(Fruit::price)
        .toList();

What if we wanted to apply a 20% discount to all the prices?

var discountedFruits = fruits.stream()
        .map(f -> new Fruit(f.name(), f.category(), f.price() * 0.8))
        .toList();

In this case, we don’t want to mutate the original instances, but we want a new collection at discounted prices. This technique is typically referred to as “projection” and can represent the process of transforming existing types into different instances of the same type or some other type.
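
A projection can also produce a different type altogether. The sketch below assumes a simplified two-field Fruit record and a hypothetical PriceTag record, neither of which is part of the data model above.

```java
import java.util.List;

class ProjectionSketch {
    // Simplified stand-in for the article's Fruit record (no category).
    record Fruit(String name, double price) {}

    // Hypothetical target type for the projection.
    record PriceTag(String label) {}

    static List<PriceTag> toPriceTags(List<Fruit> fruits) {
        return fruits.stream()
                .map(f -> new PriceTag(f.name() + ": " + f.price())) // Fruit -> PriceTag
                .toList();
    }

    public static void main(String[] args) {
        var fruits = List.of(new Fruit("Apple", 2.0), new Fruit("Mango", 5.0));
        toPriceTags(fruits).forEach(tag -> System.out.println(tag.label()));
    }
}
```

The original Fruit instances are left untouched, and the resulting list has the element type List<PriceTag>.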

The skip, limit, and sorted methods

Data doesn’t always come in the order we want, but streams help us restructure collections based on their values. If you’ve ever had to write imperative code to order elements in a collection, you’ll likely rejoice at these alternatives.

The first method we’ll look at is skip, which allows you to iterate past a specified number of elements.

var skipFive = fruits.stream().skip(5).toList();

The limit method allows you to retrieve a set number of elements.

var onlyFive = fruits.stream().limit(5).toList();

The sorted method allows you to use a Comparator to reorder elements by values. Let’s reorder our Fruit collection from highest to lowest prices. A Comparator compares two elements and returns a negative number, zero, or a positive number to indicate whether the first element should come before, alongside, or after the second. You’ll need the following imports in your Java file to use Java’s built-in Comparator implementations.

import static java.util.Comparator.comparing;
import static java.util.Comparator.reverseOrder;

This will allow you to use the static methods of comparing and reverseOrder, which you can apply to order the values from highest to lowest.

var highestToLowest = fruits.stream()
        .sorted(comparing(Fruit::price, reverseOrder()))
        .toList();
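
Comparators can also be chained when one sort key isn’t enough. The following sketch sorts by category first and then by price from highest to lowest within each category. To keep it short, it assumes a Fruit record whose category is a plain String rather than the FruitCategory enum.

```java
import java.util.Comparator;
import java.util.List;

class SortingSketch {
    // Simplified stand-in: category is a String here, not the FruitCategory enum.
    record Fruit(String name, String category, double price) {}

    static List<String> byCategoryThenPriciest(List<Fruit> fruits) {
        return fruits.stream()
                .sorted(Comparator.comparing(Fruit::category)              // primary key: category, A to Z
                        .thenComparing(Fruit::price, Comparator.reverseOrder())) // tie-breaker: highest price first
                .map(Fruit::name)
                .toList();
    }

    public static void main(String[] args) {
        var fruits = List.of(
                new Fruit("Apple", "SubAcid", 2.0),
                new Fruit("Grapefruit", "Acid", 7.0),
                new Fruit("Mango", "SubAcid", 5.0),
                new Fruit("Kiwi", "Acid", 12.0));
        System.out.println(byCategoryThenPriciest(fruits)); // prints [Kiwi, Grapefruit, Mango, Apple]
    }
}
```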

These methods can be combined to page through any collection and are especially useful in large datasets.

int pageSize = 5;
int currentPage = 1;

var highestToLowest = fruits.stream()
        .sorted(comparing(Fruit::price, reverseOrder()))
        .skip((currentPage - 1) * pageSize)
        .limit(pageSize)
        .toList();

As you can see, the strength of streams comes from chaining these methods together to produce a concise declarative pipeline that is easy to read, understand, and maintain.
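
If you page through collections often, the sort, skip, and limit steps can be wrapped in a small reusable helper. This is only a sketch: the page method and its parameters are hypothetical, and it assumes page numbers start at 1.

```java
import java.util.Comparator;
import java.util.List;

class PagingSketch {
    // Hypothetical helper: returns one page of the sorted items (currentPage is 1-based).
    static <T> List<T> page(List<T> items, Comparator<? super T> order, int currentPage, int pageSize) {
        return items.stream()
                .sorted(order)
                .skip((long) (currentPage - 1) * pageSize) // jump over the earlier pages
                .limit(pageSize)                           // keep at most one page of elements
                .toList();
    }

    public static void main(String[] args) {
        var numbers = List.of(5, 3, 1, 4, 2);
        System.out.println(page(numbers, Comparator.<Integer>naturalOrder(), 2, 2)); // prints [3, 4]
        System.out.println(page(numbers, Comparator.<Integer>naturalOrder(), 3, 2)); // prints [5]
    }
}
```

A page beyond the end of the data simply comes back as an empty list, so callers don’t need a separate bounds check.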

Terminal operations

Whether streams are short or complex, you’ll eventually want to execute the pipeline and possibly return a result.

When dealing with streams, you’ll have a choice of methods known as terminal operations. These methods include toList, toArray, and forEach, to name but a few. When invoked, these methods execute the stream up to the point of the call, and any additional stream operations must be appended to a new stream.

List<Double> priceList = fruits.stream()
        .map(Fruit::price)
        .toList(); // terminate stream

// a new stream from List<Double>
double total = priceList.stream()
        .mapToDouble(Double::doubleValue)
        .sum(); // terminate new stream

As homework, check out IntStream, LongStream, and DoubleStream as specific stream implementations with type-specific methods.
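
As a head start on that homework, here is a small sketch of two type-specific conveniences: IntStream.rangeClosed for generating a range of numbers and summaryStatistics for computing several aggregates in one pass. The class and method names are invented for this example.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.IntStream;

class PrimitiveStreamSketch {
    // IntStream can generate its own elements; no collection is needed.
    static int sumOfRange(int from, int to) {
        return IntStream.rangeClosed(from, to).sum();
    }

    // DoubleStream can gather count, sum, min, max, and average in a single pass.
    static DoubleSummaryStatistics priceStats(List<Double> prices) {
        return prices.stream()
                .mapToDouble(Double::doubleValue)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        System.out.println(sumOfRange(1, 10)); // prints 55
        var stats = priceStats(List.of(2.0, 5.0, 7.0));
        System.out.println(stats.getMin() + " to " + stats.getMax() + ", total " + stats.getSum());
    }
}
```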

Being aware of termination matters because a stream may only be operated upon once and is closed after its terminal operation runs. Attempting to reuse it leads to an IllegalStateException. In the following example, toList is called twice on the same stream.

var stream = fruits.stream()
        .map(Fruit::name);

var one = stream.toList();
var two = stream.toList(); // Error: IllegalStateException
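
One possible way around this limitation, sketched here rather than prescribed, is to wrap the pipeline in a Supplier so that every call builds a fresh stream. The names StreamReuseSketch, secondUseFails, and consumeTwice are hypothetical.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamReuseSketch {
    // Demonstrates the failure: the second terminal operation throws.
    static boolean secondUseFails(List<String> names) {
        Stream<String> stream = names.stream().map(String::toUpperCase);
        stream.toList(); // first terminal operation consumes the stream
        try {
            stream.toList(); // second attempt on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // A Supplier creates a new stream per call, so each one is consumed only once.
    static List<List<String>> consumeTwice(List<String> names) {
        Supplier<Stream<String>> upperCased = () -> names.stream().map(String::toUpperCase);
        return List.of(upperCased.get().toList(), upperCased.get().toList());
    }

    public static void main(String[] args) {
        var names = List.of("Maarten", "Marit");
        System.out.println(secondUseFails(names)); // prints true
        System.out.println(consumeTwice(names));   // prints [[MAARTEN, MARIT], [MAARTEN, MARIT]]
    }
}
```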

Now that you understand terminating operations, let’s move on to the next section for more examples.

Java Stream Queries

When dealing with a collection, in our case, a collection of Fruit instances, we may want to find specific instances or know if any elements match particular criteria. Streams support find methods such as findFirst and findAny, and the match methods anyMatch, allMatch, and noneMatch. These methods are also efficient because they short-circuit: they stop processing as soon as the result is known.

Let’s say we need to find the first fruit priced over 10.0 in the Acid category.

var expensiveAcid = fruits.stream()
        .filter(f -> f.price() > 10.0 && f.category() == FruitCategory.Acid)
        .findFirst();

We can use the method anyMatch to determine if any elements exist that match a predicate, with a boolean informing us of the result.

var hasExpensiveAcid = fruits.stream()
        .anyMatch(f -> f.price() > 10.0 && f.category() == FruitCategory.Acid);

You may use the allMatch method to see if all elements match a particular criterion.

var nothingIsFree = fruits.stream()
        .allMatch(f -> f.price() > 0.0);
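
The remaining two methods, noneMatch and findAny, follow the same pattern. Here is a minimal sketch, assuming a simplified two-field Fruit record and hypothetical method names.

```java
import java.util.List;
import java.util.Optional;

class MatchSketch {
    // Simplified stand-in for the article's Fruit record (no category).
    record Fruit(String name, double price) {}

    // noneMatch is true when no element satisfies the predicate.
    static boolean noneFree(List<Fruit> fruits) {
        return fruits.stream().noneMatch(f -> f.price() <= 0.0);
    }

    // findAny returns some matching element; unlike findFirst, it makes no promise about which one.
    static Optional<Fruit> anyCheaperThan(List<Fruit> fruits, double limit) {
        return fruits.stream()
                .filter(f -> f.price() < limit)
                .findAny();
    }

    public static void main(String[] args) {
        var fruits = List.of(new Fruit("Apple", 2.0), new Fruit("Fig", 8.0));
        System.out.println(noneFree(fruits));                        // prints true
        System.out.println(anyCheaperThan(fruits, 1.0).isPresent()); // prints false
    }
}
```

One detail worth knowing: on an empty stream, allMatch and noneMatch both return true, while anyMatch returns false.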

Reducing values in Java Streams

You may have heard the phrase “map/reduce” used in the development industry, and in general, it’s the combination of those two actions to produce a singular result. These results use aggregating functions such as average, sum, count, min, and max. These values typically produce metadata about the collection as a whole and can help summarize the total state of a collection.
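
To see the two halves of “map/reduce” side by side, here is a small sketch that uses the general-purpose reduce method to total a list of prices. It assumes a simplified two-field Fruit record, and the totalPrice method is hypothetical.

```java
import java.util.List;

class ReduceSketch {
    // Simplified stand-in for the article's Fruit record (no category).
    record Fruit(String name, double price) {}

    static double totalPrice(List<Fruit> fruits) {
        return fruits.stream()
                .map(Fruit::price)         // map: turn each Fruit into its price
                .reduce(0.0, Double::sum); // reduce: fold all prices into one total, starting from 0.0
    }

    public static void main(String[] args) {
        var fruits = List.of(new Fruit("Apple", 2.0), new Fruit("Mango", 5.0), new Fruit("Peach", 7.0));
        System.out.println(totalPrice(fruits)); // prints 14.0
    }
}
```

The built-in aggregate methods are essentially ready-made reductions, so you rarely need to call reduce directly for common cases.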

The count method is the most common reduction people will use, likely without realizing it. If a stream is created directly from a collection, count just returns the collection size, so you’re usually better off calling the collection’s size method. It becomes more useful after intermediate operations such as filter, when you want to understand the makeup of a collection. In the case of our dataset, we may want to find out how many acidic fruits we offer in a given collection.

var acidicTotalCount = fruits.stream()
        .filter(f -> f.category() == FruitCategory.Acid)
        .count();

A typical reduction is finding a dataset’s average. Looking at the Fruit collection, we can ask “What is the average price of our inventory?” Here, we’ll need to add up all the price values and divide the sum by the total number of items. Streams have a helpful average method that takes care of the math for us. Because a stream may be empty, average returns an OptionalDouble rather than a plain double.

var averagePrice = fruits.stream()
        .mapToDouble(Fruit::price)
        .average();
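
Note that average returns an OptionalDouble, since an empty stream has no average. One way to unwrap it, sketched here with a hypothetical averageOrZero helper and an assumed fallback of 0.0:

```java
import java.util.List;
import java.util.OptionalDouble;

class AverageSketch {
    // Hypothetical helper: falls back to 0.0 when there is nothing to average.
    static double averageOrZero(List<Double> prices) {
        OptionalDouble average = prices.stream()
                .mapToDouble(Double::doubleValue)
                .average();
        return average.orElse(0.0);
    }

    public static void main(String[] args) {
        System.out.println(averageOrZero(List.of(2.0, 4.0))); // prints 3.0
        System.out.println(averageOrZero(List.of()));         // prints 0.0
    }
}
```

Whether 0.0 is a sensible default depends on your domain; in some cases, passing the OptionalDouble along to the caller is the more honest choice.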

Similar to average, the sum method helps calculate the total based on a particular field.

var sum = fruits.stream()
        .mapToDouble(Fruit::price)
        .sum();

The min and max methods can also be used to find the lowest and highest values, respectively.

var min = fruits.stream()
        .mapToDouble(Fruit::price)
        .min();

var max = fruits.stream()
        .mapToDouble(Fruit::price)
        .max();

IntelliJ IDEA Streams tips and tricks 

As you may have guessed, IntelliJ IDEA offers a few quality-of-life features that make your experience working with streams much more enjoyable.

IntelliJ IDEA can detect common issues around Stream usage, particularly if a stream has already been consumed. As mentioned, attempting to consume a stream multiple times will lead to an IllegalStateException at runtime. Avoid the exception by paying attention to the in-editor warnings.

We’ve been showing some straightforward stream examples, but they can get increasingly complex, especially as each chained method can change the results and their types. In IntelliJ IDEA, type hints offer an overview of each step in a stream statement.

If you’re still having doubts about how a stream is executing, you can use the stream trace features in IntelliJ IDEA to step through each chained step.

Tip: Explore custom type renderers to visualize essential elements of type instances.

Finally, if you find streams more confusing than their imperative counterparts, you can use a quick-fix to replace any stream usage with its loop version. You can also switch back and forth to get a better personal understanding of the code’s intention.

Conclusion

Streams are a powerful feature introduced in Java 8, and while it’s been almost a decade since their release, they are still an often underappreciated part of the Java ecosystem. Whether you’re new to Java or an experienced developer, we hope this post gives you some new ideas for experimenting with the Streams API.

Additionally, IntelliJ IDEA offers excellent tools and quick-fixes to help you during the learning process. Feel free to contact our great Java developer advocates and the JetBrains community to share your thoughts.

As always, we’d love to hear your thoughts in the comments section.
