A simple way to read the lines of a file is to load them all into memory – both Guava and Apache Commons IO provide a quick way to do just that:
Files.readLines(new File(path), Charsets.UTF_8);
FileUtils.readLines(new File(path), StandardCharsets.UTF_8);
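Both calls read the entire file into a List<String>. For reference, here's a self-contained sketch of the two (the path is a placeholder; note that the Files class here is Guava's com.google.common.io.Files, not java.nio.file.Files):
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

import org.apache.commons.io.FileUtils;

import com.google.common.base.Charsets;
import com.google.common.io.Files;

public class ReadAllLinesExample {
    public static void main(String[] args) throws IOException {
        File file = new File("path/to/file.txt"); // placeholder path

        // Guava – the whole file ends up in a List<String>
        List<String> guavaLines = Files.readLines(file, Charsets.UTF_8);

        // Commons IO – same result, same memory footprint
        List<String> commonsLines = FileUtils.readLines(file, StandardCharsets.UTF_8);
    }
}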
The problem with this approach is that all of the file's lines are kept in memory – which will quickly lead to an OutOfMemoryError if the file is large enough.
For example – reading a ~1Gb file:
@Test
public void givenUsingGuava_whenIteratingAFile_thenWorks() throws IOException {
String path = ... // path to the ~1Gb file from the example
Files.readLines(new File(path), Charsets.UTF_8);
}
This starts off with only a small amount of memory consumed (~0 Mb):
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Total Memory: 128 Mb
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Free Memory: 116 Mb
However, by the time the full file has been processed (~2 Gb consumed):
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Total Memory: 2666 Mb
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Free Memory: 490 Mb
This means that roughly 2.1 Gb of memory (2666 Mb total minus 490 Mb free) are consumed by the process – the reason is simple: the lines of the file are all being stored in memory now.
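For reference, the memory figures above can be captured with a small helper along these lines (a sketch, assuming SLF4J – only the logger name comes from the log output above):
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MemoryLogger {

    // The logger name matches the output shown above; everything else is an assumption
    private static final Logger logger =
        LoggerFactory.getLogger("org.baeldung.java.CoreJavaIoUnitTest");

    static void logMemory() {
        Runtime runtime = Runtime.getRuntime();
        long mb = 1024 * 1024;
        logger.info("Total Memory: {} Mb", runtime.totalMemory() / mb);
        logger.info("Free Memory: {} Mb", runtime.freeMemory() / mb);
    }
}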
Note that taking a heap dump with jmap (the last argument is the process id of the JVM) doesn't solve the problem, but it does help diagnose it – the dump can be inspected with a tool such as Eclipse MAT to see what is occupying the memory:
jmap -dump:format=b,file=filename 6054
It should be obvious by this point that keeping the contents of the file in memory will quickly exhaust the available memory – regardless of how much that actually is.
What’s more, we usually don’t need all of the lines in the file in memory at once – instead, we just need to be able to iterate through each one, do some processing and throw it away. So, this is exactly what we’re going to do – iterate through the lines without holding them in memory.
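To see what that looks like in practice, here's a minimal sketch of one such approach (not necessarily the one we'll settle on), based on the standard java.nio.file.Files.lines API; the path and the processing step are placeholders:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class IterateLinesExample {
    public static void main(String[] args) throws IOException {
        String path = "path/to/large/file.txt"; // placeholder path

        // Files.lines reads lazily – only the current line needs to fit in memory
        try (Stream<String> lines = Files.lines(Paths.get(path), StandardCharsets.UTF_8)) {
            lines.forEach(line -> {
                // process the line here; once processed, it becomes
                // eligible for garbage collection
            });
        }
    }
}
The try-with-resources block matters here – the stream keeps the underlying file handle open until it is closed.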