Programming mkdir -p
It’s midsummer so let’s do something light and easy for a change. In UNIX there is the
mkdir -p command, meaning “create directory and all leading (parent) directories”. It is a convenient command with which you can quickly create a new directory tree. In programming, there is the
mkdir() system call. This system call creates new filesystem directories, but it does not, however, automatically create any new leading parent directories for you. An attempt to do so is an error:
ENOENT, or “No such file or directory”. How can we programmatically make deep directories?
A naive answer is to call
system("mkdir -p my/deep/path"). While not incorrect, it comparatively uses lots of resources, but even if we don’t care about that, it poses the interesting question: “Well, how does the mkdir shell command do it?”
A possible approach is to to break the path argument up into pieces, and for each piece create the new subdirectory. Something along these lines:
break up "my/deep/path" into "my", "deep", "path" mkdir "my" mkdir "my/deep" mkdir "my/deep/path"
For example, in Python:
def mkdir_p(path): arr = path.split(os.sep) newpath = '' for elem in arr: if not newpath: newpath = elem if not elem: newpath = '/' else: newpath = os.path.join(newpath, elem) if os.path.exists(newpath): continue os.mkdir(newpath)
It has some checks for correctly handling relative and absolute paths, and it gets the job done. It performs poorly however for long paths that already exist for the most part. For example, for
mkdir_p("/home/walter/src/python/blog/2013/mkdir_p") it would do seven loops already before creating a new directory. Surely we can do better.
If we could work backwards, the performance could be improved. First see if the deep path exists, take a step back, see if that path exists, and if it does, rewind and create the deeper path. This is achieved by recursion.
def mkdir_p(path): if not path: return if os.path.exists(path): return mkdir_p(os.path.dirname(path)) os.mkdir(path)
The recursive version is wonderfully terse and like often with recursive functions, it may warp your mind and take some effort to grasp.
I confess I used the non-recursive implementation for years. Ironically, Python already has a function to do this:
I had a peek in the source of Python’s
os module, and
os.makedirs() indeed uses the recursive implementation. Mind that like
os.makedirs() will throw an exception if the path already exists—unlike the
mkdir -p shell command that doesn’t complain if the directory is already there.