If you’ve used a popular tool like Word to create content and track changes, you’ll know that when you take away the tool, both the content and the tracked changes become inaccessible, leaving you with a bunch of unreadable files. Git, by contrast, is a versioning system, independent of content-creating tools, for tracking and managing universally readable files (e.g., plain text) and the changes they undergo during the development of, say, a project. It is designed to work with a variety of interfacing clients, whether command-line or GUI. In other words, tracking changes uncoupled from a content-creating tool is a feature.
In 2005, Linus Torvalds took a leave of absence from his work on the Linux kernel to fix his versioning handicap, and emerged with a rough cut of Git two weeks later. In his talk at Google a couple of years afterwards, he emphasised Git’s strength in two words: “distributed” and “branching”. (See also this TED interview.)
Now why should anyone care about tools or techniques used by software developers? For two reasons: (1) software developers use fast, efficient, safe and simple tools to get their job done, because after all, simplicity is the ultimate sophistication, and (2) software developers have unmatched discipline in managing changes in their vast code bases, which makes them uniquely good at managing and tracking content and projects at large. Git can be used for efficiently versioning and tracking any content in plain text, be it reports, papers, contracts, or computer models. So learning from the best, and using their tools in one’s own line of work, would offer similar benefits.
Mechanics of Git
Git is a collection of command-line tools that perform various versioning operations, e.g., staging files and folders in a repository for a snapshot, taking the snapshot (a commit), running diffs, checking the history of commits, and reviewing the artefacts for changes. Here’s one explanation by Atlassian:
Git is not fooled by the names of the files when determining what the storage and version history of the file tree should be, instead, Git focuses on the file content itself. After all, source code files are frequently renamed, split, and rearranged. The object format of Git’s repository files uses a combination of delta encoding (storing content differences), compression and explicitly stores directory contents and version metadata objects.
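To make the point about content over file names concrete, here is a minimal Python sketch of how Git derives the ID of a file’s contents (a “blob”), assuming the default SHA-1 object format. The function names and the sample model line are my own, not part of Git; for the same bytes, the ID should match what `git hash-object` prints.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    # The file name is not part of the input at all.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def git_loose_object(content: bytes) -> bytes:
    """Return header + content compressed with zlib, as in a loose object."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return zlib.compress(header + content)


if __name__ == "__main__":
    model = b"NODE 1 0.0 0.0 0.0\n"  # hypothetical model file content
    # Identical bytes give an identical ID, whatever the file is called.
    print(git_blob_id(model))
    print(len(git_loose_object(model)), "bytes after compression")
```

Because the ID depends only on the bytes, renaming a model file leaves its blob untouched; only the directory listing that points to it changes.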
How to manage versioning of computer models
For novice users, a GUI client is perhaps the best way to begin without the cognitive load of the command line. There are numerous clients to choose from. I like Sublime Merge as my primary GUI client. I think GitHub Desktop is nice and helpful to use too.
Let’s say you have a bunch of model files ending with the .inp file extension in a folder called jacket. Let us also assume that these are master copies. In GitHub Desktop,
- Select Add an Existing Repository… and choose the jacket folder.
- Click on the “create a repository” link.
- GitHub Desktop adds all files existing in the jacket folder, commits them as the “Initial commit”, and calls the branch main. This can be your master copy of computer model files for, say, in-place analysis.
- To maintain the sanctity of the main branch, but use it to create a model for, say, lifting analysis, create a branch called lifting. When created, the new branch starts out with the same files as the main branch, which can then be edited to suit lifting analysis.
- Once these files are prepared for lifting, one can commit them. The degree of granularity is left to the engineer to decide, though it’s good practice to commit every change made to model files, both minor and major, to maintain good traceability.
- In a slightly more advanced mode, an upstream repository can be set up (on a local network server, a Raspberry Pi, or a hosting service like GitHub), to and from which engineers in a team can push and pull changes to model files. In Git, every repository is independent and fully self-contained with all history, and can exchange changes with other repositories over the network, which makes it truly distributed and eliminates single points of failure.
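For those curious about what the GUI steps above amount to on the command line, here is a rough Python sketch that drives the `git` program through `subprocess`. It assumes Git is installed and that the folder is not yet a repository; the helper names (`git`, `start_lifting_branch`, `commit_changes`) and the placeholder identity are my own inventions for illustration, and GitHub Desktop may well do things differently under the hood.

```python
import subprocess
from pathlib import Path
from typing import List


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its standard output."""
    result = subprocess.run(
        # Placeholder identity (an assumption) so commits work on any machine.
        ["git", "-c", "user.name=Engineer",
         "-c", "user.email=engineer@example.com", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def start_lifting_branch(repo: Path) -> List[str]:
    """Version the existing model files, then branch off for lifting."""
    git(repo, "init")
    git(repo, "add", ".")                    # stage every file in the folder
    git(repo, "commit", "-m", "Initial commit")
    git(repo, "branch", "-M", "main")        # make sure the branch is called main
    git(repo, "checkout", "-b", "lifting")   # create and switch to lifting
    return git(repo, "branch", "--format=%(refname:short)").splitlines()


def commit_changes(repo: Path, message: str) -> str:
    """Commit all edits on the current branch and return the short commit ID."""
    git(repo, "add", "--all")
    git(repo, "commit", "-m", message)
    return git(repo, "rev-parse", "--short", "HEAD")
```

Calling `start_lifting_branch(Path("jacket"))` once, and then `commit_changes(Path("jacket"), "Adapt model for lifting analysis")` after each round of edits, mirrors the steps in the list: the main branch keeps the in-place model, while the lifting branch accumulates its own commits.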
Here is another example: if you are writing a contract, and you have a master copy to work from, then you can create a branch, give it a case (or client) name, and start working on the newly created branch, and keep that tree of changes independent of the master copy. The use cases for managing content and tracking changes in Git are limitless.
Software developers tend to work on branches and eventually merge back into the main branch. But if the purpose of a branch is to be independent of the master copy, as in the case of computer models or contract text, then it may never need to be merged.
All versioning is stored in the .git folder within the root of the repository folder.[1] By carrying the repository folder across (from, say, a computer, network, media drive, or whatever), you’d carry its entire history of changes too, which makes it compact and portable. Every commit records a full snapshot of the committed files as they stood at that point, so it is easy to revert or recover files after later changes in a given tree.
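As an illustration of recovering files from an earlier commit, here is a small Python sketch, again assuming Git is installed and driving it via `subprocess`. The function names are hypothetical; the underlying commands, `git show <commit>:<path>` and `git checkout <commit> -- <path>`, are standard Git.

```python
import subprocess
from pathlib import Path


def file_at_commit(repo: Path, commit: str, path: str) -> str:
    """Return `path` as recorded in `commit`; working files are not touched."""
    result = subprocess.run(
        ["git", "show", f"{commit}:{path}"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout


def restore_file(repo: Path, commit: str, path: str) -> None:
    """Overwrite the working copy of `path` with its version from `commit`."""
    subprocess.run(
        ["git", "checkout", commit, "--", path],
        cwd=repo, check=True, capture_output=True, text=True,
    )
```

For example, `file_at_commit(Path("jacket"), "HEAD~1", "jacket.inp")` would read the model file as it stood one commit ago (the file name here is made up), and `restore_file` with the same arguments would bring that version back into the folder.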
In summary, versioning via Git is a smarter, cleaner way of working than, say, creating named folders to differentiate between versions or changes within files.
[1] Note that in UNIX-like systems, files and folders beginning with a dot are typically hidden.