How to Manage Temporary Files

January 9, 2013 Steve Hawley

If you’ve written software to manipulate large chunks of data, you’ve likely created temporary files to hold the data for you. And if you’re like me, your machine currently has 2587 files in your personal Temp folder. Why? Chances are there are a ton of apps on your system that are Doing It Wrong. I’m going to talk about how most apps do this, how it goes horribly wrong, and what you can do to mitigate this problem for your own code.

Most apps will get the path to the Temp folder and run a loop creating files based on time stamp or time stamp followed by iteration until they succeed in making a new file. This path (or stream) gets used in the application in some manner and at some point in the future it will get removed. And you can bet that it’s the last step where applications screw up – they don’t (or can’t) clean up after themselves.

If you’re using Path.GetTempFileName(), stop. Seriously. Just stop using it. It tries to create a file in the form tmpXXXX.tmp, where XXXX is a four-digit hex number. This is not appropriate for a couple of reasons. First, in my own folder, there are currently 2516 files that match that pattern. In other words, 2516/65536 or 3.8% of the possible names are already in use and I swear I cleaned out that folder a couple months ago. In short order, it will fill up. Further, calls get slower as the folder fills: the nth call can fail on its first n-1 attempts to make the file before it finds a free name. On top of that, there is no infrastructure to take care of the temporary file when you’re done with it. The developer is responsible for removing it and we know how well developers remember to do things.
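
To make that cleanup burden concrete, here is a minimal sketch of what a careful caller of Path.GetTempFileName() has to write every time. The class and method names (ManualTempFileDemo, WriteAndCleanUp) are made up for illustration and are not part of any library.

```csharp
using System;
using System.IO;

public static class ManualTempFileDemo
{
    // Hypothetical helper: writes data to a temp file, then removes the file.
    // Returns the path that was used so a caller can confirm it is gone.
    public static string WriteAndCleanUp(byte[] data)
    {
        // GetTempFileName() creates a zero-byte file on disk and returns its path.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, data);
            // ... hand the path to whatever needs it ...
        }
        finally
        {
            // Nothing enforces this step. Leave it out and the file
            // stays in the Temp folder until someone cleans it up by hand.
            File.Delete(path);
        }
        return path;
    }
}
```

The try/finally is the part that tends to get forgotten, which is the whole argument for moving it into a disposable object.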

This is a resource problem: there are a limited number of slots, and the responsibility for acquiring and releasing the resource is in the hands of the developer. When you approach this as a resource problem, it becomes a whole lot harder to screw up.

So let’s set a few goals for this task. This code should be:

  • Easy to use
  • Hard to screw up
  • Customizable
  • Better than Path.GetTempFileName()

So let’s start with a TemporaryFile object. This is what is going to hold your information about the temporary file. There are a number of ways this could be structured to be more flexible (like a template base class wherein subclasses could be any Stream instead of a FileStream), but it’s called TemporaryFile not TemporaryStream, so let’s keep it simple. Here is TemporaryFile:

 
    using System;
    using System.IO;

    public class TemporaryFile : IDisposable
    {
        public TemporaryFile(string path)
        {
            if (path == null)
                throw new ArgumentNullException("path");
            Path = path;
            TemporaryStream = new FileStream(path, FileMode.Create);
        }
 
        public Stream TemporaryStream { get; private set; }
        public string Path { get; private set; }
        public override string ToString()
        {
            return Path;
        }
 
        private bool _disposed = false;
        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }
 
        protected virtual void Dispose(bool disposing)
        {
            if (!_disposed)
            {
                if (disposing)
                {
                    TemporaryStream.Dispose();
 
                    File.Delete(Path);
                }
                _disposed = true;
            }
        }
 
        ~TemporaryFile()
        {
            Dispose(false);
        }
    }
 

In this case, I chose to make the class itself immutable and disposable. As it stands you could use this class as a wrapper around the result from Path.GetTempFileName(), but we’ll do better than that. Remember that TemporaryFile is a resource, and if that’s the case, it should be IDisposable. We want to do something very specific when the file is closed. So following the pattern in IDisposable Made E-Z, I’ve made this class do a few things: open the file on construction and, on disposal, dispose the TemporaryStream and delete the file at Path. According to the documentation for Stream.Close(), it is not necessary to call Close(); instead, ensure that Stream.Dispose() gets called. If either Stream.Dispose() or File.Delete() throws an exception, you probably have worse problems on your hands. If you wanted, you could add an implementation of Equals and GetHashCode. I chose not to since reference equality should be good enough.

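Where possible, TemporaryFile objects should be scoped within a using block to ensure they get disposed in a timely manner. Otherwise you have to pray that the GC will help you out at the end of app life, and as written the finalizer path (Dispose(false)) does not delete the file, so a TemporaryFile that is never disposed leaves its file behind.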
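
One detail of the constructor above is worth a note: FileMode.Create overwrites a file that already exists at the path. If you would rather have the open fail when the name is taken, FileMode.CreateNew does that. The following is only a sketch; ExclusiveCreate and TryCreateNew are hypothetical names, and returning null for "name already taken" is a choice made for this example.

```csharp
using System;
using System.IO;

public static class ExclusiveCreate
{
    // Hypothetical helper: returns an open stream if this call created the file,
    // or null if a file with that name already existed.
    public static FileStream TryCreateNew(string path)
    {
        try
        {
            // FileMode.CreateNew throws IOException when the file already exists,
            // so checking for the name and claiming it happen in one step.
            return new FileStream(path, FileMode.CreateNew);
        }
        catch (IOException)
        {
            // Only treat the failure as "name taken" if the file really is there;
            // any other I/O failure (missing folder, etc.) is rethrown.
            if (File.Exists(path))
                return null;
            throw;
        }
    }
}
```

A caller that gets null back would simply pick another name and try again.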

We’re not done yet. As mentioned before, Path.GetTempFileName() gets worse the more you use it. We can fix that too. The filenames it generates are also a problem: we can’t fingerprint them as our own, so we can’t check at a glance whether we’re leaking temp files or, better yet, write a unit test to catch the leaks. So how do you make a TemporaryFile? Use a factory:

 
    public class TemporaryFileFactory
    {
        protected const string kDefaultPrefix = "Tmp";
        protected const string kDefaultSuffix = "";
        protected const string kDefaultExtension = "tmp";
        protected const int kDefaultRetries = 10;
 
        public static TemporaryFile MakeTemporaryFile(string extension) { return new TemporaryFileFactory(null, null, null, extension, kDefaultRetries).MakeTemporaryFile(); }
 
        public TemporaryFileFactory(string tempFolder, string prefix, string suffix, string extension, int retries)
        {
            if (retries <= 0) throw new ArgumentOutOfRangeException("retries");
            Retries = retries;
            TempFolder = String.IsNullOrEmpty(tempFolder) ? Path.GetTempPath() : tempFolder;
            if (!Directory.Exists(TempFolder))
                throw new IOException(String.Format("Temporary folder '{0}' does not exist", TempFolder));
            Prefix = String.IsNullOrEmpty(prefix) ? kDefaultPrefix : prefix;
            Suffix = suffix ?? kDefaultSuffix;
            Extension = SanitizeExtension(extension ?? kDefaultExtension);
        }
 
        private string SanitizeExtension(string ext)
        {
            if (ext.EndsWith("."))
                ext = ext.Substring(0, ext.Length - 1);
            if (ext.StartsWith("."))
                ext = ext.Substring(1);
            if (ext.Length == 0)
                ext = kDefaultExtension;
            return ext;
        }
 
        public string TempFolder { get; private set; }
        public string Prefix { get; private set; }
        public string Suffix { get; private set; }
        public string Extension { get; private set; }
        public int Retries { get; private set; }
 
        public TemporaryFile MakeTemporaryFile()
        {
            for (int i = 0; i < Retries; i++)
            {
                string path = GenerateTemporaryPath(i);
                if (!File.Exists(path))
                    return new TemporaryFile(path);
            }
            throw new IOException(String.Format("Unable to create temporary file in {0} after {1} {2}.", TempFolder, Retries, (Retries == 1 ? "try" : "tries")));
        }
 
        protected const string kDefaultFormat = "{0}{1}{2}.{3}";
        protected virtual string GenerateTemporaryPath(int retry)
        {
            Guid g = Guid.NewGuid();
            string newPath = Path.Combine(TempFolder, String.Format(kDefaultFormat, Prefix, g.ToString("N"), Suffix, Extension));
            return newPath;
        }
    }
 

In this case, we’re making the factory easy to use if you don’t care about the file name format and easy to customize if you do. The first thing you should know is that I’m defining the format of a temporary file name this way: <Prefix><content><Suffix>.<Extension>. The defaults are such that you should get Tmp<content>.tmp for your temporary file. There is a main constructor that takes values for Prefix, Suffix, and Extension and defaults them if they are null (or in some cases null or empty). The constructor also takes a path to the folder you want to use for temporary files, but you can also set this to null and it will use Path.GetTempPath() instead.

The question is, what should go in the <content> part of the file name? Why not use a GUID? Since a GUID is unique for all practical purposes, I’m fairly certain that the existence check will pass on the first try. Still, GenerateTemporaryPath() was made virtual so you could override it if this doesn’t suit your particular needs. GenerateTemporaryPath() includes a retry number, so you could reimplement Path.GetTempFileName() if you wanted to (but why?). There’s some basic error checking/patching to mitigate pilot error in SanitizeExtension() and finally a nice static factory method that news up a factory for you with defaults and generates a TemporaryFile for you. Here’s my test code:

 
        static void Main(string[] args)
        {
            List<string> files = new List<string>();
 
            try
            {
                for (int i = 0; i < 20; i++)
                {
                    using (TemporaryFile tf = TemporaryFileFactory.MakeTemporaryFile(null))
                    {
                        files.Add(tf.Path);
                        Console.WriteLine("Created " + tf.Path);
                        if (i > 18)
                            throw new Exception("no purpose");
                    }
                }
            }
            catch { }
            finally
            {
                foreach (string path in files)
                {
                    if (File.Exists(path))
                        Console.WriteLine("error - file " + path + " still exists.");
                }
            }
        }
 

which demonstrates how to make TemporaryFiles using the static factory method within a using block and throws an exception for no other reason than to demonstrate that IDisposable does what it says on the box.
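
The fingerprinting idea from earlier can be turned into a leak check along the same lines. This is a rough sketch, assuming you give the factory a distinctive prefix; TempFileLeakCheck, FindLeftovers, and the "MyApp_" prefix in the usage note are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;

public static class TempFileLeakCheck
{
    // Hypothetical helper: returns the paths in 'folder' whose names look like
    // <prefix><anything>.<extension>, i.e. files our own factory would have made.
    public static string[] FindLeftovers(string folder, string prefix, string extension)
    {
        // Directory.GetFiles accepts a wildcard search pattern such as "MyApp_*.tmp".
        return Directory.GetFiles(folder, prefix + "*." + extension);
    }
}
```

A unit test could call FindLeftovers(Path.GetTempPath(), "MyApp_", "tmp") after exercising the code under test and fail if anything comes back. Note that the default "Tmp" prefix is a weak fingerprint: on a case-insensitive file system, Tmp*.tmp also matches the tmpXXXX.tmp files left behind by Path.GetTempFileName().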

From an architecture point of view, this is good enough. It could be slightly cleaner and if I were to make any changes at all, I might put the static factory method(s) inside of TemporaryFile instead of TemporaryFileFactory, then I would call the methods Create(). If I wanted to hide the implementation details of TemporaryFile, I would probably make an interface ITemporaryStream and only put the TemporaryStream property in it. Then if you wanted to use a MemoryStream-based factory you could. I think this is a poor decision for a few reasons:

  • Temporary files are well established in the art
  • MemoryStreams (or other streams) don’t have the same need for a file deletion as a FileStream and fall outside of our problem domain
  • It would necessitate a factory for specific file streams
  • Existing code isn’t always .NET Stream compatible (for example, a C library that needs a path)

So you see that by looking at the task as a resource problem rather than a temporary file problem, we can make a solution that is just as easy to use and more resilient.

About the Author

Steve Hawley

Steve was with Atalasoft from 2005 until 2015. He was responsible for the architecture and development of DotImage, and one of the masterminds behind Bacon Day. Steve has over 20 years of experience with companies like Bell Communications Research, Adobe Systems, Newfire, and Presto Technologies.
