Wednesday, November 16, 2011

C# File Handling


Nearly every non-trivial program needs to process file data, and the classes the base class library provides for this have many details to learn. Here we evaluate file handling in the C# language through examples and benchmarks focused on file I/O.
"One of the most significant sources of inefficiency is unnecessary input/output (I/O)." McConnell, p. 598

StreamReader/StreamWriter

Two of the most useful types for handling files in the C# language and .NET Framework are the StreamReader and StreamWriter types; they are described here. Often, you can achieve better performance with StreamReader and StreamWriter than with static File methods.
Program that uses ReadLine [C#]

using System.IO;

class Program
{
    static void Main()
    {
        // Read in every line in the file.
        using (StreamReader reader = new StreamReader("file.txt"))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Do something with line
                string[] parts = line.Split(',');
            }
        }
    }
}

Result
    Lines in text file are read in and separated on commas.

File.ReadAllText

This simple program uses the File.ReadAllText method to load in the file "file.txt" on the C: volume. Then, it prints the contents of the file, which are now stored in a string object.
System.IO namespace [C#]

//
// Include this namespace for all the examples.
//
using System.IO;

Program that reads in file [C#]

using System;
using System.IO;

class Program
{
    static void Main()
    {
        string file = File.ReadAllText("C:\\file.txt");
        Console.WriteLine(file);
    }
}

Output

File contents.
File.ReadAllText benchmark. Is File.ReadAllText efficient? To answer this, the ReadAllText method was benchmarked against a StreamReader helper. On a 4 KB file, ReadAllText took roughly 40% longer.
Program that uses ReadAllText and StreamReader [C#]

using System.IO;

class Program
{
    static void Main()
    {
        // Read in file with File class.
        string text1 = File.ReadAllText("file.txt");

        // Alternative: use custom StreamReader method.
        string text2 = FileTools.ReadFileString("file.txt");
    }
}

public static class FileTools
{
    public static string ReadFileString(string path)
    {
        // Use StreamReader to consume the entire text file.
        using (StreamReader reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }
}

Benchmark results

File.ReadAllText:         155 ms
FileTools.ReadFileString: 109 ms
StreamReader helper. In some projects, it would be worthwhile to use the above ReadFileString custom static method. In a project that opens hundreds of small files, it would save 0.1 milliseconds per file.
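
The comparison above could be timed with a harness along these lines. This is only a sketch under stated assumptions: the iteration count, the temporary 4 KB test file, and the ReadBenchmark class name are hypothetical and are not taken from the original benchmark. Timings will vary by machine and .NET version.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class ReadBenchmark
{
    // Reads the whole file through a StreamReader, like ReadFileString above.
    public static string ReadWithStreamReader(string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }

    // Calls 'read' on 'path' the given number of times.
    // Returns the elapsed time in milliseconds.
    public static long Time(Func<string, string> read, string path, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            read(path);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}

class Program
{
    static void Main()
    {
        // Assumption: a temporary file of about 4 KB stands in for the test file.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, new string('a', 4096));

        // Hypothetical iteration count; the original count is not stated.
        const int iterations = 10000;

        // Call each method once first so JIT compilation is not measured.
        File.ReadAllText(path);
        ReadBenchmark.ReadWithStreamReader(path);

        long ms1 = ReadBenchmark.Time(p => File.ReadAllText(p), path, iterations);
        long ms2 = ReadBenchmark.Time(p => ReadBenchmark.ReadWithStreamReader(p), path, iterations);

        Console.WriteLine("File.ReadAllText:    {0} ms", ms1);
        Console.WriteLine("StreamReader helper: {0} ms", ms2);

        File.Delete(path);
    }
}
```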

File.ReadAllLines

Here you want to read all the lines in from a file and place them in an array. The following code reads in each line in the file "file.txt" into an array. This is efficient code; we provide performance information later on.
Program that uses ReadAllLines [C#]

using System.IO;

class Program
{
    static void Main()
    {
        // Read in every line in specified file.
        // ... This will store all lines in an array in memory,
        // ... which you may not want or need.
        string[] lines = File.ReadAllLines("file.txt");
        foreach (string line in lines)
        {
            // Do something with line
            if (line.Length > 80)
            {
                // Example code
            }
        }
    }
}

Result
    We loop over lines in the file.
    We test the length of each line.
File.ReadAllLines performance. When you read in a file with File.ReadAllLines, many strings are allocated and put into an array in a single method call. With StreamReader, however, you can allocate each string as you pass over the file by calling ReadLine. This makes StreamReader more efficient unless you need all the file data in memory at once.

Read into List. We look at a usage of the List constructed type with file handling methods. List and ArrayList are extremely useful data structures for C# programmers, as they allow object collections to rapidly expand or shrink. Here we look at how you can use LINQ to get a List of lines from a file in one line.
Program that uses ReadAllLines with List [C#]

using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
    static void Main()
    {
        // Read in all lines in the file,
        // ... and then convert to a List with LINQ.
        List<string> fileLines = File.ReadAllLines("file.txt").ToList();
    }
}
Count lines. Here we need to count the number of lines in a file but don't want to write lots of code to do it. Note that the example here doesn't have ideal performance characteristics. We reference the Length property on the array returned.
Program that counts lines [C#]

using System.IO;

class Program
{
    static void Main()
    {
        // Another method of counting lines in a file.
        // ... This is not the most efficient way.
        // ... It counts empty lines.
        int lineCount = File.ReadAllLines("file.txt").Length;
    }
}
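
For larger files, one alternative is to count lines with a StreamReader so that the whole file is never held in an array. The following is a minimal sketch; the LineCounter class and CountLines method are hypothetical names introduced for illustration. Like the version above, it counts empty lines.

```csharp
using System;
using System.IO;

static class LineCounter
{
    // Counts lines one at a time.
    // Only the current line is kept in memory.
    public static int CountLines(string path)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            while (reader.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }
}

class Program
{
    static void Main()
    {
        // Assumption: a small temporary file stands in for "file.txt".
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "a\n\nb\n");

        // The empty middle line is counted as well.
        Console.WriteLine(LineCounter.CountLines(path));

        File.Delete(path);
    }
}
```
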
Query example. Does a line containing a specific string exist in the file? Maybe you want to see if a name or location exists in a line in the file. Here we can harness the power of LINQ to find any matching line. See also the Contains method on the List type.
Program that uses LINQ on file [C#]

using System.IO;
using System.Linq;

class Program
{
    static void Main()
    {
        // One way to see if a certain string is a line
        // ... in the specified file. Uses LINQ to count elements
        // ... (matching lines), and then sets |exists| to true
        // ... if more than 0 matches were found.
        bool exists = (from line in File.ReadAllLines("file.txt")
                       where line == "Some line match"
                       select line).Count() > 0;
    }
}

File.ReadLines

In contrast to File.ReadAllLines, File.ReadLines does not read in every line immediately upon calling it. Instead, it reads lines only as they are needed. It is best used in a foreach loop.
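
As a rough illustration, File.ReadLines can be used in a foreach loop, or combined with the LINQ Any method so that reading stops at the first match. This is a sketch only; the LineSearch class and its method names are hypothetical, and File.ReadLines requires .NET Framework 4.0 or later.

```csharp
using System;
using System.IO;
using System.Linq;

static class LineSearch
{
    // Returns true as soon as a line equal to 'target' is found.
    // Lines after the first match are not read.
    public static bool ContainsLine(string path, string target)
    {
        return File.ReadLines(path).Any(line => line == target);
    }

    // Counts lines longer than 'maxLength', reading one line at a time.
    public static int CountLongLines(string path, int maxLength)
    {
        int count = 0;
        foreach (string line in File.ReadLines(path))
        {
            if (line.Length > maxLength)
            {
                count++;
            }
        }
        return count;
    }
}

class Program
{
    static void Main()
    {
        // Assumption: a small temporary file stands in for "file.txt".
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new string[] { "cat", "Some line match", "dog" });

        Console.WriteLine(LineSearch.ContainsLine(path, "Some line match"));
        Console.WriteLine(LineSearch.CountLongLines(path, 3));

        File.Delete(path);
    }
}
```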

File.WriteAllLines

Here we look at how you can write an array to a file. When you are done with your in-memory processing, you often need to write the data to disk. Fortunately, the File class offers an excellent WriteAllLines method. It receives the file path and then the array to write. This will replace all the file contents.
Program that writes array to file [C#]

using System.IO;

class Program
{
    static void Main()
    {
        // Write a string array to a file.
        string[] stringArray = new string[]
        {
            "cat",
            "dog",
            "arrow"
        };
        File.WriteAllLines("file.txt", stringArray);
    }
}

Output

cat
dog
arrow

File.WriteAllText

A very simple method to call, the File.WriteAllText method receives two arguments: the path of the output file, and the exact string contents of the text file. If you need to create a simple text file, this is an ideal method.
Program that uses File.WriteAllText [C#]

using System.IO;

class Program
{
    static void Main()
    {
        File.WriteAllText("C:\\perls.txt",
            "Dot Net Perls");
    }
}

Output
    C:\perls.txt contains the string "Dot Net Perls"

File.AppendAllText

The File.AppendAllText method appends text to a file in a single call. The previous example replaces the file's contents, but for a log file or error listing, we must append to the file. Note that we could read in the file, append to that in memory, and then write it out completely again, but that's slow.
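
A log-style append might look like the following sketch. The Log class, its Append method, and the temporary file path are hypothetical and only serve the illustration. File.AppendAllText creates the file if it does not already exist.

```csharp
using System;
using System.IO;

static class Log
{
    // Appends one message as its own line.
    // Existing file contents are kept.
    public static void Append(string path, string message)
    {
        File.AppendAllText(path, message + Environment.NewLine);
    }
}

class Program
{
    static void Main()
    {
        // Assumption: a file in the temp directory stands in for a real log file.
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());

        Log.Append(path, "first entry");
        Log.Append(path, "second entry");

        // Both entries are present, in the order they were appended.
        Console.Write(File.ReadAllText(path));

        File.Delete(path);
    }
}
```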

File.AppendText

The File.AppendText method returns a StreamWriter instance that you can use to append string data to the specified file. It is not covered on this site in detail because it is usually easier to use the StreamWriter constructor directly.
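
The two approaches can be compared side by side. In this sketch, the AppendDemo class and the temporary file path are hypothetical; the second argument to the StreamWriter constructor selects append mode.

```csharp
using System;
using System.IO;

static class AppendDemo
{
    // Appends one line with each approach.
    public static void AppendBoth(string path)
    {
        // File.AppendText returns a StreamWriter opened for appending.
        using (StreamWriter writer = File.AppendText(path))
        {
            writer.WriteLine("first");
        }

        // The StreamWriter constructor appends when the second argument is true.
        using (StreamWriter writer = new StreamWriter(path, true))
        {
            writer.WriteLine("second");
        }
    }
}

class Program
{
    static void Main()
    {
        // Assumption: a file in the temp directory stands in for a real file.
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());

        AppendDemo.AppendBoth(path);
        Console.Write(File.ReadAllText(path));

        File.Delete(path);
    }
}
```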

File.ReadAllBytes

Here we use File.ReadAllBytes to read a PNG image into memory. One example usage of this sample is to cache an image in memory for performance. This works very well and greatly outperforms reading in the image each time.
Program that caches binary file [C#]

using System.IO;

static class ImageCache
{
    static byte[] _logoBytes;
    public static byte[] Logo
    {
        get
        {
            // Returns logo image bytes, reading the file only on first access.
            if (_logoBytes == null)
            {
                _logoBytes = File.ReadAllBytes("Logo.png");
            }
            return _logoBytes;
        }
    }
}
