In the previous article, we explored why Git is essential in modern development. Today, we are going to learn how a version control system (VCS) works and, more specifically, look at Git internals.

Open your favourite terminal, as this article is going to be full of practicals.

To understand the working of Git, we will create a project named git-internal using the command given below.

mkdir git-internal

Then we will go inside the project folder using the `cd git-internal` command.

Once the project folder is set up, we are ready to understand the internal working of Git. Before that, we are going to build a mental model of the code-tracking problem, reason our way to a solution, and only then go into Git's jargon.

Let's suppose we are building a product that tells us the weather for a city we ask about. For that, we need some code in a file that we can run to get the desired output. For this article, we are going to use JavaScript as the programming language to build this software.

So, we are going to create a file named index.js with the command `vim index.js`. I am using Vim as the terminal editor; feel free to use any code editor for this project. As soon as we create the index file, shouldn't we record somewhere that a file has been created in the git-internal folder? But how? Let’s leave it for now, write a simple line of JavaScript (for example, `console.log("Weather Report");`), and then run the code using the command `node index.js`.

We should see the simple output in the terminal saying “Weather Report”.

But wait a minute… where is the code to find the weather report of a given city? Don’t worry, by the end of the article, you will have it.

So far, what has happened?

We created a file named index.js.
We wrote a simple console.log statement in that file.

Simple, right?

We should have tracked these 2 things, and any other activity inside the git-internal folder. So, should we store these changes/activities in a database, or somewhere else? But where?

Let’s suppose we store this in a database: at 6:59:03 pm on 31st Dec 2025, a file is created inside the git-internal folder. At 6:59:30 pm on 31st Dec 2025, some code is added. We are tracking activities, which include file creation and any change within a file, such as adding, deleting, or updating characters (spaces, new lines, or any other characters). But what about the code itself? How are we going to track exactly which code changed at what time? In this approach, we are tracking every change to every character, right?

So, it can be a very compute-intensive task, and maintaining the history of code changes becomes very difficult too, right? Storing, reading, updating, and deleting activities at this rate triggers a huge number of database calls, which can overwhelm the database. And yes, there are plenty of databases optimised for high-frequency operations, but just because something exists doesn’t mean we should use it everywhere. Using those databases can be expensive and does not make sense for this problem.

So what is the solution then?

Fine, we can’t perform high-frequency database operations just to track code changes, but what if we track changes in blocks? I mean, we make some changes: create new files, add a bunch of code/features, improve the existing code, and whatever else we want to do. Then, once we are done with our changes, we store this update in our database all at once, right?

Fine, we got a solution, but we still have to query the database via API calls. If our network is not stable, or for some reason we are not able to connect to the database, then what? We have solved the problem of tracking code with a loop: create new project → make changes → track changes → store them in the db → update the code → track changes → store them in the db → and so on. But what if the database is not available?

One thing we can do is to store all the tracked files, updates, and changes in a file or folder inside the same project root folder. Can we do that?

Why not? If we can store these things in a database, we can also store them in files. Awesome, so we have almost solved the problem from first principles, and arrived at the way Git operates internally.
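To make the idea concrete, here is a minimal sketch of such a file-based tracker in Node.js. It is only a toy built on the reasoning above, not how Git is implemented: the `.tracker` folder, the `log` file, and the `saveSnapshot` function are all made-up names for illustration.

```javascript
// A toy tracker, NOT Git's implementation: it copies each saved version of a
// file into a hidden ".tracker" folder (a hypothetical name) inside the project.
const fs = require('fs');
const os = require('os');
const path = require('path');
const crypto = require('crypto');

function saveSnapshot(projectDir, fileName) {
    const content = fs.readFileSync(path.join(projectDir, fileName));
    // Name the stored copy after a hash of its content,
    // so identical content is stored only once.
    const id = crypto.createHash('sha1').update(content).digest('hex');
    const storeDir = path.join(projectDir, '.tracker');
    fs.mkdirSync(storeDir, { recursive: true });
    fs.writeFileSync(path.join(storeDir, id), content);
    // Append one line per snapshot: when it was taken, which file, which copy.
    const entry = `${new Date().toISOString()} ${fileName} ${id}\n`;
    fs.appendFileSync(path.join(storeDir, 'log'), entry);
    return id;
}

// Demo in a throwaway temp folder so no real project is touched.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'tracker-demo-'));
fs.writeFileSync(path.join(demoDir, 'index.js'), 'console.log("Weather Report");\n');
const firstId = saveSnapshot(demoDir, 'index.js');

fs.appendFileSync(path.join(demoDir, 'index.js'), '// more code\n');
const secondId = saveSnapshot(demoDir, 'index.js');

const history = fs
    .readFileSync(path.join(demoDir, '.tracker', 'log'), 'utf8')
    .trim()
    .split('\n');
console.log(history.length, firstId !== secondId); // 2 true
```

Notice that nothing here needs a network or a database: the whole history lives next to the code, and we only write to it when we decide a block of changes is done.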

BUT WAIT!!!!….

So far, nothing we have worked out used any of the technical words or jargon that Git uses. So, let’s dive in and relate what we have understood so far to Git's internal working.

Git jargon

Now it’s time to create a Git repository inside the git-internal folder we created.

Before we create a Git repository by running the `git init` command inside the root folder, the folder contains only 1 file. After initializing Git in the folder, a plain `ls` still shows only that file, nothing else. But as we discussed earlier, there should be some folder or files that Git uses to track the code.

Yes, Git creates a folder, but it is hidden: its name starts with a dot, so it is not listed by default, which keeps Git's bookkeeping out of the way while it tracks changes quietly in the background. So, we can’t see it?…

Not really. Though the folder is hidden, we can still see it using the command `ls -al`, or `ll` in a Linux terminal (my system is Ubuntu 22.x).

You can also see the success message that Git shows: Initialized empty Git repository in /folder/path/.git/. And the `ls -la` command not only displays the files and folders in the current directory, including hidden ones, but also provides additional details for each file and folder.

Hidden files and folders start with a dot (.), and Git has one too: .git

If we go into this folder and run `ls`, we will see a bunch of files and folders.

branches (folder)

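Despite its name, this folder does not hold the branches you create (those live under refs/heads, covered below). It is a legacy location for shorthand URLs used by git fetch, git pull, and git push; it is deprecated and usually empty, and newer versions of Git may not create it at all.

config (file)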

This file contains the local, repository-specific configuration that controls how Git operates in this repository. Its settings override any global or system-wide configuration for the project.

It stores a bunch of information like:
- Remote repository URL
- Branch-specific configuration.
- User info (though user info is usually set globally, it can also be set differently for a specific repository/project).

description (file)

By default, it just contains the single line Unnamed repository; edit this file 'description' to name the repository. It is meant to hold a plain-text description of the project.

HEAD (file)

This file contains a reference to the currently checked-out branch (for example, ref: refs/heads/main), which itself points to the actual commit hash. (Don’t worry, we are going to see all about hashes, object types, and so on.)
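As a rough illustration, the content of HEAD can be interpreted with a few lines of JavaScript. This is a sketch, not Git's own code: `parseHead` and the shape of the object it returns are made up here, and the branch name `main` is only an assumption (your default branch may be called `master` or something else).

```javascript
// Interprets the text stored in .git/HEAD. The function name and the
// returned object shape are hypothetical, chosen for this illustration.
function parseHead(content) {
    const text = content.trim();
    if (text.startsWith('ref: ')) {
        // HEAD points at a branch, e.g. "ref: refs/heads/main".
        const ref = text.slice('ref: '.length);
        return { kind: 'branch', ref, branch: ref.replace(/^refs\/heads\//, '') };
    }
    // Otherwise HEAD holds a commit hash directly (a "detached HEAD").
    return { kind: 'detached', hash: text };
}

const head = parseHead('ref: refs/heads/main\n');
console.log(head.kind);   // branch
console.log(head.branch); // main
```

To try it on a real repository, you could pass in the result of `fs.readFileSync('.git/HEAD', 'utf8')` instead of the hard-coded string.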

hooks (folder)

This folder contains some sample scripts, and we can also create our own scripts that run at different stages of Git's workflow, such as before a commit is created, after a commit, before a merge, or before pushing the changes to the remote repository.

info (folder)

This folder contains local, private information about the repository, which is not intended to be shared with other repositories or developers. The main file inside this folder is exclude, which acts as a private .gitignore file: unlike .gitignore, it is not committed, so it stays in the local repository and is never shared on GitHub.

objects (folder)

This is the heart of .git: the folder where Git stores everything it tracks. There are 3 main types of objects in Git (there is also a fourth, the tag object, which we won't cover here).
- Commit
- Tree
- Blob

We are going to discuss all three objects in detail, with how git stores them.

refs (folder)

The objects folder stores the actual tracked content, addressed by SHA-1 hashes, which are not easy to remember. Every time we need to access the history, we would have to use the SHA-1 hash of the respective object. So, to give these SHA-1 hashes easy, memorable names, Git uses the refs folder.

It contains:
- heads
- tags
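The lookup from a memorable name to a hash can be sketched like this. It is a simplified model that works on an in-memory Map instead of real files: the branch name, the tag name, and both hashes are placeholders, `resolveName` is a made-up function, and real Git has more lookup rules (for example packed refs and remote branches) that are left out here.

```javascript
// In-memory stand-in for a few files under .git. The branch name, tag name,
// and hashes are placeholders, not real values.
const gitFiles = new Map([
    ['HEAD', 'ref: refs/heads/main\n'],
    ['refs/heads/main', '1111111111111111111111111111111111111111\n'],
    ['refs/tags/v1.0', '2222222222222222222222222222222222222222\n'],
]);

// Turns a name such as "HEAD", "main", or "v1.0" into a commit hash.
function resolveName(files, name) {
    // Try the name as given, then as a tag, then as a branch.
    const candidates = [name, `refs/tags/${name}`, `refs/heads/${name}`];
    for (const candidate of candidates) {
        if (!files.has(candidate)) continue;
        const value = files.get(candidate).trim();
        // "ref: <path>" means "look over there instead".
        return value.startsWith('ref: ')
            ? resolveName(files, value.slice('ref: '.length))
            : value;
    }
    return null; // unknown name
}

console.log(resolveName(gitFiles, 'HEAD')); // 1111111111111111111111111111111111111111
console.log(resolveName(gitFiles, 'v1.0')); // 2222222222222222222222222222222222222222
```

So a ref is nothing more than a small text file whose content is a hash, or a pointer to another ref.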

Now, we have a decent understanding of what Git has inside its folder. Next, let’s go through the different states of Git to understand how the Git flow works; then we will come back to the objects and refs folders, which are the only two we need to understand for now.

The lifecycle of the status of your files.

The image above is taken from the official git page.

There are a total of 4 states in Git (if we are just using Git locally).

Untracked
Unmodified
Modified
Staged

There is one more step called Commit: committing staged files saves them and brings them back to the Unmodified state.

When we initialize a Git repository in a folder, that folder is called the working directory in Git terms. Initially, all the files in it are Untracked.
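One way to picture these states is as a tiny state machine. The sketch below is a simplified model written for this article, not something Git exposes: the `transitions` table and the `next` function are made-up names, and it ignores cases such as a file that is staged and then edited again.

```javascript
// Simplified lifecycle of one file: which action moves it to which state.
const transitions = {
    untracked:  { add: 'staged' },
    staged:     { commit: 'unmodified' },
    unmodified: { edit: 'modified', remove: 'untracked' },
    modified:   { add: 'staged' },
};

function next(state, action) {
    const target = transitions[state] && transitions[state][action];
    if (!target) {
        throw new Error(`"${action}" does not apply to a ${state} file in this model`);
    }
    return target;
}

// Walk one file through two full rounds of the cycle.
let state = 'untracked';
for (const action of ['add', 'commit', 'edit', 'add', 'commit']) {
    state = next(state, action);
    console.log(`${action} -> ${state}`);
}
// add -> staged
// commit -> unmodified
// edit -> modified
// add -> staged
// commit -> unmodified
```

This is exactly the loop we will follow by hand in the rest of the article.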

Untracked State

This is the state of files that are not tracked by Git yet, which means Git is not recording any history for them. We need to tell Git explicitly to track such a file. As shown in the Git state image, we can change the file’s state from Untracked to Staged so that Git starts tracking it. But before that, let's add some working code so that we have something to work with.

So, now we have hard-coded the weather reports of different cities in the index.js file.

// index.js
console.log("Weather Report Applications");

const weather_report = {
    "jsr": {
        "city": "Jamshedpur",
        "temperature": "30°C",
        "condition": "Sunny"
    },
    "del": {
        "city": "Delhi",
        "temperature": "35°C",
        "condition": "Hot"
    },
    "mum": {
        "city": "Mumbai",
        "temperature": "28°C",
        "condition": "Humid"
    }
}

The index.js file is still untracked. Let’s move it into the staging area (the Staged state) with the command below.

git add index.js

Instead of listing files manually, as in git add file1 file2 .., we can add everything at once using git add . which stages all new and changed files in the current folder and its subfolders.

Now, as you can see, our index.js file has moved to the staging area, and its status has changed to A in the VS Code Source Control sidebar, which stands for Index Added. In the terminal, you can also see that the file is being tracked.

In Git, the staging area is also called the index. It holds the version of each file that is ready to be committed. Committing saves that version in Git's local storage, the objects folder, which contains commits, trees, and blobs.

Once we are at the staging area, we can either commit the changes right away, or update more content → add the changes to the staging area again → and finally commit, so that Git records the final changes in its local storage (Object → Blob).
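The relationship between the working directory, the index, and a commit can be modelled with three plain JavaScript values. This is a toy model, not Git's data format: `workingDir`, `index`, `commits`, `add`, and `commit` are all made-up names. It shows one behaviour worth remembering: a commit records what was staged, not whatever happens to be in the file at commit time.

```javascript
// Toy model: plain objects standing in for the working directory,
// the staging area (index), and the commit history.
const workingDir = { 'index.js': 'console.log("v1");' };
const index = {};
const commits = [];

// "git add": copy the file's current content into the index.
function add(fileName) {
    index[fileName] = workingDir[fileName];
}

// "git commit": save a snapshot of whatever the index holds right now.
function commit(message) {
    commits.push({ message, snapshot: { ...index } });
}

add('index.js');
workingDir['index.js'] = 'console.log("v2");'; // edited after staging
commit('first version');

console.log(commits[0].snapshot['index.js']); // console.log("v1");
```

To get the edited content into a commit, the file has to be staged again, which is why the add → commit cycle repeats.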

Below is the command to commit the changes.

git commit -m "short message for the changes you made"

So, what have we done so far?

We created a folder and added the index.js file with some code.
Initialized the Git repo in the same folder.
Added the index.js file to the staging area.
Committed the staged file.

Now we are in the Unmodified state, where index.js is tracked and stored in Git's object storage. Simply put, “We have committed our first version of the code.”

Next, we can add new features, fix bugs, update some code, and again move it to the staging area and commit another version of the code.

So, let’s complete the code for finding weather information.

const readline = require('readline');

const weather_report = {
    "jsr": {
        "city": "Jamshedpur",
        "temperature": "30°C",
        "condition": "Sunny"
    },
    "del": {
        "city": "Delhi",
        "temperature": "35°C",
        "condition": "Hot"
    },
    "mum": {
        "city": "Mumbai",
        "temperature": "28°C",
        "condition": "Humid"
    }
}

function getWeatherReport(cityCode) {
    const report = weather_report[cityCode];
    if (report) {
        return `City: ${report.city}, Temperature: ${report.temperature}, Condition: ${report.condition}`;
    } else {
        return "City code not found.";
    }
}

const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
})

rl.question("Enter city code (jsr, del, mum): ", (cityCode) => {
    const report = getWeatherReport(cityCode)
    console.log(report)
    rl.close();
})

The code is simple to understand.

We have imported readline (a built-in Node.js module for reading input from the user).
weather_report: a constant that stores the weather report of 3 different places.
getWeatherReport(): a function that takes the cityCode as an argument, finds the weather of the respective city, and returns it in a readable format.
const rl = readline.createInterface({ input: process.stdin, output: process.stdout }): this creates an interface that lets us read input from the user and print output in the terminal via standard input and output.
And finally, we ask the user for the city code and show them the result.
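If you want to check the lookup logic without typing into a prompt, you can call getWeatherReport() directly. The sketch below repeats the data and the function from above in compact form so that it runs on its own; the code "xyz" is just a made-up example of a city code that is not in the table.

```javascript
const weather_report = {
    "jsr": { "city": "Jamshedpur", "temperature": "30°C", "condition": "Sunny" },
    "del": { "city": "Delhi", "temperature": "35°C", "condition": "Hot" },
    "mum": { "city": "Mumbai", "temperature": "28°C", "condition": "Humid" }
};

function getWeatherReport(cityCode) {
    const report = weather_report[cityCode];
    if (report) {
        return `City: ${report.city}, Temperature: ${report.temperature}, Condition: ${report.condition}`;
    } else {
        return "City code not found.";
    }
}

// Call the function directly instead of going through the readline prompt.
const known = getWeatherReport("del");
const unknown = getWeatherReport("xyz");
console.log(known);   // City: Delhi, Temperature: 35°C, Condition: Hot
console.log(unknown); // City code not found.
```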

Don’t worry if you don’t understand the code.

As you can see, as soon as we added some code after making a commit, the file status changed to M, which means the file is now in the Modified state. We have added some code to this file, but Git has not recorded this new version yet.

To make Git aware of the changes, we need to stage them again. So let’s do it.

Now, as you can see, Git says clearly that this index.js file is modified and also staged (obviously after doing it). Now, let’s commit it so that the new version of the code is stored in the Git object storage.

Now you can see in the terminal above that the working tree is clean, which means the 2nd version of code is also stored in the Git Object storage system. Also, in VS Code, you can see that there are no symbols on the right side of the file name, which confirms this.