Naming Paths in Code
A proposal for conventions that clearly describe the different parts of file paths in code while avoiding name clashes.
1 The Problem
When writing programs that shuffle files around, I often run into the same issue: I need variables to refer to files, directories, their parent directories, their absolute paths, their paths relative to some other custom root directory, and so on. Before I know it, I end up with a large number of variables, have to pay attention to avoid name clashes, and need to think a good while before giving each one a meaningful name. This mostly becomes an issue when I can't be bothered to organise my code properly and split it up into functions of sensible length. But then I would get the same problem with function names. Could more sophisticated IDEs help there? To some extent, maybe, but how many other problems would they introduce in the process? No, I ended up deciding that there's nothing like a bit of diligence when writing code. And so I tried to come up with some guidelines for naming paths in code.
2 Several Considerations
While it very much depends on the context of the software you write, there are some general considerations that you may be able to identify:
- Are you dealing with a file or a directory? Some say directories are a kind of file. But for our purpose here, let's keep it simple and say that files are regular files, e.g. in the sense given by the UNIX `find` command.
- Are you dealing with an input or an output file/directory?
- Are you dealing with the absolute path of the file/directory, its relative path, or only its name?
This is a glimpse of the sort of considerations you might want to keep in mind when working out what kind of names your variables should bear.
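Two of these questions (file or directory; absolute path, relative path or bare name) can be answered mechanically, whereas input versus output is a matter of intent. The following is a minimal sketch using Python's standard library; the helper name `describe` and the sample file name are made up for the illustration.

```python
import os
import tempfile


def describe(path):
    """Return (kind, form, name) for one path (hypothetical helper)."""
    if os.path.isfile(path):
        kind = "file"      # regular file; note that isfile() follows symlinks
    elif os.path.isdir(path):
        kind = "dir"
    else:
        kind = "other"     # missing, socket, pipe, ...
    form = "absolute" if os.path.isabs(path) else "relative"
    return kind, form, os.path.basename(path)


with tempfile.TemporaryDirectory() as tmpdirroot:
    filepath = os.path.join(tmpdirroot, "notes.txt")
    with open(filepath, "w") as handle:
        handle.write("hello\n")
    print(describe(filepath))    # kind is "file", name is "notes.txt"
    print(describe(tmpdirroot))  # kind is "dir"
```

The third element of the result is only the last component of the path, which is exactly the distinction the naming rules try to keep visible.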
3 Variable Naming Rules
Once these considerations are clearly defined, coming up with a set of rules is fairly straightforward. Following the previously-listed considerations:
- If you're dealing with a file, your variable name should comprise the word `file` somewhere. Otherwise, if you're dealing with a directory, it should comprise `dir`. This can easily go further, of course, with `symlink`, `hardlink`, `socket`, `pipe`, ...
  There's certainly the problem that, depending on the language you use, some useful words might already be taken. For instance, Python has `dir` as a built-in function (and Python 2 had a built-in `file` type), which, more often than I'd like, clashes with the names I would gladly have christened my variables with. But using the next rules will naturally solve this by adding more particles to the names.
- If you're dealing with an input file or directory, call it `infile` or `indir`. Likewise for output files and directories; you see where I'm going with this.
- If you're dealing with an absolute path, you could use the word `root`. If you're dealing with a relative path, you could use the word `path`. If you're dealing with just the name of a file or directory, you could use the word `name`.
  In particular, it seems desirable to avoid leaving any room for ambiguity. For instance, instead of saying just `indir`, be explicit as to whether you're referring to the `indir`'s absolute path (`indirroot`), the `indir`'s name (`indirname`), or the `indir`'s relative path (`indirpath`).
  In that last case, you might arguably want to cast out the word `path` altogether as, if you're like me, it might lead to confusion because it's the word you naturally use in your variable names when you don't give it much thought. In fact, why not just use `inabsdir` for absolutes, `inreldir` for relatives, `indirname` for names? It's a bit counter-intuitive, however, that the `name` particle comes last when the `abs` and `rel` ones come in the middle. Food for thought.
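To make the rules concrete, here is a small sketch that applies them while copying the regular files of one directory into another. The function `copy_regular_files` and the directory and file names are invented for the illustration; only the variable naming is the point. It follows the first variant (`root` for absolute paths, `path` for relative ones, `name` for bare names).

```python
import os
import shutil
import tempfile


def copy_regular_files(indirroot, outdirroot):
    """Copy the regular files directly under indirroot into outdirroot.

    Both arguments are assumed to be absolute paths of existing directories.
    Returns the sorted names of the copied files.
    """
    outfilenames = []
    for infilename in sorted(os.listdir(indirroot)):
        infileroot = os.path.join(indirroot, infilename)
        if not os.path.isfile(infileroot):
            continue  # skip subdirectories and anything that is not a file
        outfileroot = os.path.join(outdirroot, infilename)
        shutil.copyfile(infileroot, outfileroot)
        outfilenames.append(infilename)
    return outfilenames


with tempfile.TemporaryDirectory() as workdirroot:
    indirroot = os.path.join(workdirroot, "in")
    outdirroot = os.path.join(workdirroot, "out")
    os.makedirs(os.path.join(indirroot, "sub"))
    os.makedirs(outdirroot)
    with open(os.path.join(indirroot, "a.txt"), "w") as infile:
        infile.write("a\n")
    indirname = os.path.basename(indirroot)              # "in"
    indirpath = os.path.relpath(indirroot, workdirroot)  # "in", relative to workdirroot
    print(indirname, indirpath, copy_regular_files(indirroot, outdirroot))
```

Even in a snippet this short, six path-like variables are in play, and each name says which file or directory it belongs to and which form of the path it holds.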
Of course, when you're working in a small function where only one variable refers to some kind of path, you might wonder whether troubling yourself with this kind of rule is at all worthwhile. But then, before you know it, small functions become large ones, or you might want to move the code to a larger block, or you might want to know what the path refers to in the context the function is called from. So maybe this sort of diligence really does apply everywhere.
4 Conclusion
To summarise the above proposal, the following particles would be in order:
- Type: `file`, `dir`, `symlink`, `hardlink`, `socket`, `pipe`
- Purpose: `in`, `out`
- Component: `root`, `path`, `name`
It appears that the longest variable name could in this case be `outhardlinkname`. That's 15 characters, about 19% of an 80-column line, if you care about such things.
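The length of the longest possible name can be checked by enumerating every combination of the particles listed above. This is a small sketch; the purpose-type-component ordering is taken from the earlier examples such as `indirname`, and the constant and variable names are my own.

```python
from itertools import product

TYPES = ["file", "dir", "symlink", "hardlink", "socket", "pipe"]
PURPOSES = ["in", "out"]
COMPONENTS = ["root", "path", "name"]

# Build every purpose + type + component name, e.g. "indirname".
names = ["".join(parts) for parts in product(PURPOSES, TYPES, COMPONENTS)]
longest = max(len(name) for name in names)

print(len(names), "possible names; the longest has", longest, "characters")
print([name for name in names if len(name) == longest])
print(f"{longest / 80:.0%} of an 80-column line")
```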
Again, this is just an example of how you could organise your path naming rules. To come up with your own, it might be worth having a look, for inspiration, at the terminology your programming language uses. For instance, looking at the Python `os.path` reference, I see that they use `abspath`, `basename`, `dirname`, `normpath`, `realpath` and `relpath`, which must have influenced me when I came up with my own rules.
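For reference, here is roughly what some of those functions return for one sample path. The path is made up, and the sketch uses `posixpath` (the POSIX flavour of `os.path`) so that the results don't depend on the operating system. `abspath` and `realpath` are left out because their results depend on the current directory and on the file system.

```python
import posixpath

# Hypothetical absolute path, deliberately written in a messy way.
infileroot = "/srv/data/./in/../in/report.txt"

infilenorm = posixpath.normpath(infileroot)
print(infilenorm)                                  # /srv/data/in/report.txt
print(posixpath.basename(infilenorm))              # report.txt
print(posixpath.dirname(infilenorm))               # /srv/data/in
print(posixpath.relpath(infilenorm, "/srv/data"))  # in/report.txt
```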