Java - Reading a Large File Efficiently

Posted in: Java | Posted on: July 6, 2014 at 10:06 PM

In this tutorial you will learn how to read a large file efficiently in Java, without loading the whole file into memory.

Reading a large file efficiently is an easy task if we use the correct Java API.

In this tutorial we discuss Java programs for reading a large file without causing memory or performance issues. There are two types of files: text files and binary files. To read a text file you should use the BufferedReader class, and if the file is binary then the BufferedInputStream class should be used.

If you have a small file, say 3-4 MB, you can read the complete file into memory and then process the data. But if the file size is in GBs, you can't read the complete file into memory: the memory available to the program is limited, and loading everything up front also costs CPU time. In most cases we don't need the complete data of a file at once, so we should read the file line by line, or one range of data at a time, and process it. Once that data is processed, we can load the next set of data.

So, the ideal way is to read the file one line at a time and process the data efficiently. This makes your program efficient and also consumes less memory and CPU power.
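
As a minimal sketch of this read, process, discard pattern, the helper below counts the non-empty lines of a file while keeping only the current line in memory. The class name LineByLineSketch, the method countNonEmptyLines and the UTF-8 encoding are assumptions made for illustration; they are not part of the examples in this tutorial.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of the original tutorial
public class LineByLineSketch {

    // Counts non-empty lines; only one line is held in memory at a time
    public static long countNonEmptyLines(String fileName) throws IOException {
        // Assumption: the file is UTF-8 encoded
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), StandardCharsets.UTF_8));
        long count = 0;
        try {
            String line;
            // readLine() returns null at end of file
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    count++;
                }
                // 'line' becomes garbage after this iteration
            }
        } finally {
            // Closing the reader also closes the wrapped streams
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java LineByLineSketch <file>");
            return;
        }
        System.out.println(countNonEmptyLines(args[0]));
    }
}
```

Because each line is dropped as soon as it has been counted, memory use depends on the longest line, not on the size of the file.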

How to read a large binary file?

One way to read a binary file is to use the FileInputStream class.

FileInputStream reads raw bytes from a file, which makes it suitable for binary data such as images. In our example we will use the fileInputStream.read() method, which returns one byte per call, or -1 when the end of the file is reached.
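
To make the return value of read() concrete, here is a small sketch that inspects the first two bytes of a file. It relies on the fact that JPEG files begin with the bytes 0xFF 0xD8. The class name ReadReturnValueSketch and the method looksLikeJpeg are hypothetical names chosen for this illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical example class, not part of the original tutorial
public class ReadReturnValueSketch {

    // Returns true if the file starts with the JPEG marker bytes 0xFF 0xD8
    public static boolean looksLikeJpeg(String fileName) throws IOException {
        FileInputStream in = new FileInputStream(fileName);
        try {
            // read() returns the next byte as an int in the range 0..255,
            // or -1 if the end of the file has been reached
            int first = in.read();
            int second = in.read();
            return first == 0xFF && second == 0xD8;
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java ReadReturnValueSketch <file>");
            return;
        }
        System.out.println(looksLikeJpeg(args[0]));
    }
}
```

The int return type is what allows -1 to signal end of file without colliding with a real byte value.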

The following example shows how to read a binary file and copy its data into another file:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class ReadAndWriteBinaryFileExample1 {


    public static void main(String[] args) throws IOException {

        FileInputStream fileInputStream = null;
        FileOutputStream fileOutputStream = null;

        try 
        {
            //Open the input and out files for the streams
            fileInputStream = new FileInputStream("test.jpg");
            fileOutputStream = new FileOutputStream("test_copy.jpg");
            int data;

            //Read each byte and write it to the output file
            //value of -1 means end of file
            while ((data = fileInputStream.read()) != -1) {
                fileOutputStream.write(data);
            }
        }
        catch (IOException e)
        {
            //Display or throw the error
            System.out.println("Error while executing the program: " + e.getMessage());
        }    
         finally 
        {
            //Close the resources correctly
            if (fileInputStream != null)
            {
                fileInputStream.close();
            }
            if (fileOutputStream != null)
            {
                fileOutputStream.close();
            }
        }
        
    }

}

If you run the above program, the data from test.jpg will be copied to the test_copy.jpg file. But the overall performance of the program is not good, as it reads just one byte at a time and then writes it to the output stream. The performance of the program can be improved by using BufferedInputStream.

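The java.io.BufferedInputStream class adds an in-memory buffer on top of another input stream. Data is fetched from the underlying stream in larger blocks, so most read calls are served from memory, which improves the performance of the application.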
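
The effect is easiest to see in code that still calls read() once per byte. The sketch below counts how often a given byte value occurs in a file. The class name BufferedByteScanSketch and the method countByte are hypothetical, and the default buffer size of BufferedInputStream is used as is.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example class, not part of the original tutorial
public class BufferedByteScanSketch {

    // Counts the bytes in the file that equal 'target' (a value in 0..255)
    public static long countByte(String fileName, int target) throws IOException {
        InputStream in = new BufferedInputStream(new FileInputStream(fileName));
        long matches = 0;
        try {
            int data;
            // read() is still called once per byte, but most calls are
            // answered from the in-memory buffer instead of the file
            while ((data = in.read()) != -1) {
                if (data == target) {
                    matches++;
                }
            }
        } finally {
            // Closing the buffered stream also closes the FileInputStream
            in.close();
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java BufferedByteScanSketch <file>");
            return;
        }
        // Count line-feed bytes as an example
        System.out.println(countByte(args[0], '\n'));
    }
}
```

Only the constructor call differs from an unbuffered version; the loop itself is unchanged.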

The following example is a modified version of the above program that uses the BufferedInputStream class for better performance.

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAndWriteBinaryFileExample2 {

    public static void main(String[] args) throws IOException {

        InputStream fileInputStream = null;
        FileOutputStream fileOutputStream = null;

        try
        {
            //Open the input and output files for the streams
            fileInputStream = new BufferedInputStream(new FileInputStream("test.jpg"));
            fileOutputStream = new FileOutputStream("test_copy.jpg");

            //Read data into the buffer and then write it to the output file
            byte[] buffer = new byte[1024];
            int bytesRead;
            //read(buffer) returns the number of bytes read, or -1 at end of file
            while ((bytesRead = fileInputStream.read(buffer)) != -1)
            {
                fileOutputStream.write(buffer, 0, bytesRead);
            }
        }
        catch (IOException e)
        {
            //Display or throw the error
            System.out.println("Error while executing the program: " + e.getMessage());
        }
        finally
        {
            //Close the resources correctly
            if (fileInputStream != null)
            {
                fileInputStream.close();
            }
            if (fileOutputStream != null)
            {
                fileOutputStream.close();
            }
        }

    }

}

If you run the above program, it will copy the data from test.jpg to test_copy.jpg in much less time. So, if you want to develop a fast application, use the BufferedInputStream class to read binary data.

Java 7 added the try-with-resources statement. We now modify the above program (the BufferedInputStream example) to use try-with-resources, which makes the code much simpler.

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAndWriteBinaryFileExample3 {

    public static void main(String[] args) throws IOException {

        try (
            //Open the input and output files for the streams
            InputStream fileInputStream = new BufferedInputStream(new FileInputStream("test.jpg"));
            FileOutputStream fileOutputStream = new FileOutputStream("test_copy.jpg")
        )
        {
            //Read data into the buffer and then write it to the output file
            byte[] buffer = new byte[1024];
            int bytesRead;
            while ((bytesRead = fileInputStream.read(buffer)) != -1)
            {
                fileOutputStream.write(buffer, 0, bytesRead);
            }
        } //both streams are closed automatically here

    }

}

The advantage of try-with-resources is that it automatically closes the opened streams when execution of the code block ends, whether it ends normally or with an exception. This is good news for programmers, as the resources are released reliably without hand-written cleanup code.
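
When several resources are declared in one try-with-resources statement, they are closed in the reverse order of their declaration. The sketch below shows this with a hypothetical NamedResource class that records when it is closed; the class names and the events list exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of the original tutorial
public class CloseOrderSketch {

    // Records what happened, in order
    static final List<String> events = new ArrayList<>();

    // Hypothetical resource that logs its own close() call
    static class NamedResource implements AutoCloseable {
        private final String name;

        NamedResource(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            events.add("close " + name);
        }
    }

    public static List<String> run() {
        events.clear();
        try (NamedResource first = new NamedResource("first");
             NamedResource second = new NamedResource("second")) {
            events.add("body");
        }
        // At this point both resources have been closed
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [body, close second, close first]
    }
}
```

This ordering matters for wrapped streams: a wrapper declared after the stream it wraps is closed first, so it can flush any pending data before the underlying stream is closed.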

Reading text files

Now we will discuss reading a large text file efficiently.

To read a large text file efficiently we can use the same trick: read one line of data, process it, and then load the next line until the end of the file is reached. The BufferedReader and Scanner classes are most commonly used for this purpose.

Check the tutorial: How to read a file in Java efficiently using BufferedReader and Scanner class?

Here is an example of reading a large file using the BufferedReader class:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;

public class BufferedReaderExample1 {
   public static void main(String[] args) throws Exception {
      
      BufferedReader br = null;
      try {
         // Open an input stream for reading the text file data.txt
         InputStream is = new FileInputStream("data.txt");

         // Wrap the byte stream in a reader that decodes bytes to characters
         InputStreamReader instrm = new InputStreamReader(is);

         // Wrap the reader in a BufferedReader for efficient line reads
         br = new BufferedReader(instrm);

         String strLine;

         // Read one line at a time; readLine() returns null at end of file
         while ((strLine = br.readLine()) != null) {
            // Print line
            System.out.println(strLine);
         }

      } catch (Exception e) {
         e.printStackTrace();
      } finally {
         // Close the reader; this also closes the wrapped streams
         if (br != null) {
            br.close();
         }
      }
   }
}

Now we will update the above code to use the Java 7 try-with-resources feature:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;

public class BufferedReaderExample2 {
   public static void main(String[] args) throws Exception {
      
      try (
         // Open an input stream for reading the text file data.txt
         InputStream is = new FileInputStream("data.txt");

         // Wrap the byte stream in a reader that decodes bytes to characters
         InputStreamReader instrm = new InputStreamReader(is);

         // Wrap the reader in a BufferedReader for efficient line reads
         BufferedReader br = new BufferedReader(instrm)
      ) {
         String strLine; 
         // Read one line at a time 
         while((strLine = br.readLine()) != null)
         {    
            // Print line
            System.out.println(strLine);
         }
         
      }
   }
}
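
One detail worth noting: new InputStreamReader(is) decodes bytes with the JVM's default charset, which can differ between machines and JDK versions. The sketch below passes the charset explicitly and also stops after a fixed number of lines, which is handy for peeking at a large file. The class name FirstLinesSketch, the method firstLines and the choice of UTF-8 are assumptions for this illustration.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of the original tutorial
public class FirstLinesSketch {

    // Returns at most maxLines lines from the start of the file
    public static List<String> firstLines(String fileName, int maxLines) throws IOException {
        List<String> lines = new ArrayList<>();
        // Assumption: the file is UTF-8 encoded
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), StandardCharsets.UTF_8))) {
            String line;
            // Stop as soon as enough lines were collected or the file ends
            while (lines.size() < maxLines && (line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java FirstLinesSketch <file>");
            return;
        }
        for (String line : firstLines(args[0], 5)) {
            System.out.println(line);
        }
    }
}
```

The rest of the file is never read beyond what the buffer has already fetched, so the cost stays small even for a multi-gigabyte file.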

So, you have learned how to read a large text file using the BufferedReader class. Now I will explain how to use the java.util.Scanner class to read a large text file.

Following is an example of reading a file in Java using the Scanner class:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;


public class ScannerFileReadExample1 {

    public static void main(String args[]) throws FileNotFoundException {
  
        //Create the file object
        File fileObj = new File("data.txt");
      
        //Scanner object for reading the file
        Scanner scnr = new Scanner(fileObj);
      
        //Reading each line of file using Scanner class
        while(scnr.hasNextLine()){
            String strLine = scnr.nextLine();
            //print data on console
            System.out.println(strLine);
        }

        //Close the scanner to release the underlying file
        scnr.close();
    
    }   
  
}

Now we will update the above code to use the Java 7 try-with-resources feature:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;


public class ScannerFileReadExample2 {

    public static void main(String args[]) throws FileNotFoundException {

        //Create the file object
        File fileObj = new File("data.txt");

        try (
            //Scanner object for reading the file
            Scanner scnr = new Scanner(fileObj)
        )
        {
            //Reading each line of file using Scanner class
            while(scnr.hasNextLine()){
                String strLine = scnr.nextLine();
                //print data on console
                System.out.println(strLine);
            }
        }
    }
}

Running any of the text file examples above prints the contents of data.txt. With the sample file used here, the following output is displayed:

Reading a Large text File Efficiently - Line 1
Reading a Large text File Efficiently - Line 2
Reading a Large text File Efficiently - Line 3
Reading a Large text File Efficiently - Line 4
Reading a Large text File Efficiently - Line 5
Reading a Large text File Efficiently - Line 6
Reading a Large text File Efficiently - Line 7
Reading a Large text File Efficiently - Line 8
Reading a Large text File Efficiently - Line 9
Reading a Large text File Efficiently - Line 10
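
A caveat for the Scanner examples: Scanner does not throw an IOException when reading fails part-way through. It stops returning lines and remembers the error, which can be retrieved with its ioException() method. The sketch below checks for this after the loop. The class name ScannerErrorCheckSketch, the method countLines and the UTF-8 charset name are assumptions for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.Scanner;

// Hypothetical example class, not part of the original tutorial
public class ScannerErrorCheckSketch {

    // Counts the lines of a file and reports read errors that Scanner hid
    public static long countLines(String fileName) throws IOException {
        long lines = 0;
        // Assumption: the file is UTF-8 encoded
        try (Scanner scnr = new Scanner(new File(fileName), "UTF-8")) {
            while (scnr.hasNextLine()) {
                scnr.nextLine();
                lines++;
            }
            // ioException() returns the last IOException thrown by the
            // underlying source, or null if no such exception occurred
            IOException hidden = scnr.ioException();
            if (hidden != null) {
                throw hidden;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java ScannerErrorCheckSketch <file>");
            return;
        }
        System.out.println(countLines(args[0]));
    }
}
```

Without this check, a read error in the middle of a large file would look the same as reaching the end of the file.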

In this section you have learned the different ways of reading a large file (binary and text) efficiently.
