Using Git submodules

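Projects often depend on third-party Git repositories. The simplest way to bring one into your project is copy and paste, but then how do you pick up the change when that library ships a cool new feature? (What is needed is, in effect, a Git-based package manager.)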

This is what git-submodule is for: it embeds a third-party repository in your own repository as a reference to a specific commit.

Adding a third-party repository

For example, I created two repositories for testing: the main one is called main and the other is called sub.

First, add sub to the local clone of main:

$ git clone https://www.github.com/username/main.git
$ cd main
$ git submodule add https://www.github.com/username/sub.git <directory_name>

Checking the status now shows two new entries:

➜  main git:(master) ✗ git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)

new file: .gitmodules
new file: sub

There is now a sub entry and a .gitmodules file. Commit them:

$ git commit -am "add sub"

Using the combined repository elsewhere

After committing locally, push to the remote:

$ git push

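The contents of the remote repository now look like this:

There is an odd entry, sub @ 2b79b47, that was never added as an ordinary file or directory.
Clicking it shows that it is a link that leads straight to the sub repository. The main repository does not store the third-party library's files. It stores only the commit ID of sub (here 2b79b47), and .gitmodules records where sub can be fetched from. The files themselves are pulled only after the repository has been cloned locally.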
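
To check that main really stores only a pointer, here is a small sandbox sketch. It assumes throwaway local repositories under a temporary directory. The names sub-origin and main and the new_repo helper are made up for illustration, and protocol.file.allow=always is passed only because recent Git versions refuse to clone submodules from local paths by default.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
WORK=$(mktemp -d)

# new_repo DIR FILE: hypothetical helper, creates a repository with one commit on master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "demo" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

new_repo "$WORK/sub-origin" lib.txt
new_repo "$WORK/main" README
SUB_COMMIT=$(git -C "$WORK/sub-origin" rev-parse HEAD)

cd "$WORK/main"
git -c protocol.file.allow=always submodule --quiet add "$WORK/sub-origin" sub
git commit -q -m "add sub"

# Mode 160000 marks a gitlink: the tree holds a commit ID, not the files of sub.
ENTRY=$(git ls-tree HEAD sub)
echo "$ENTRY"
```

The printed entry has the form `160000 commit <id>` followed by the path sub, where the ID is the commit that sub-origin was at when it was added.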

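At this point, a plain git clone of main somewhere else does not produce a complete working tree: the files of sub are missing.
The main/sub directory is empty and one more step is needed. Why? The reason is explained below.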

$ git clone the/path/of/main.git
$ cd main
$ git submodule init && git submodule update

# The following command has the same effect as the commands above; it just adds the `--recursive` option
$ git clone the/path/of/main.git --recursive

Only now is the repository complete.
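
The difference between the two ways of cloning can be tried out in a sandbox. This is a minimal sketch under stated assumptions: all repositories are throwaway local ones in a temporary directory, the names main-origin, plain, full, and the new_repo helper are hypothetical, and protocol.file.allow=always is there only because the submodule URL is a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
WORK=$(mktemp -d)

# new_repo DIR FILE: hypothetical helper, creates a repository with one commit on master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "demo" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

new_repo "$WORK/sub-origin" lib.txt
new_repo "$WORK/main-origin" README
git -C "$WORK/main-origin" -c protocol.file.allow=always submodule --quiet add "$WORK/sub-origin" sub
git -C "$WORK/main-origin" commit -q -m "add sub"

# Plain clone: the sub directory is created but left empty.
git clone -q "$WORK/main-origin" "$WORK/plain"
PLAIN_FILES=$(ls -A "$WORK/plain/sub")

# init registers the submodule in .git/config, update fetches the recorded commit.
cd "$WORK/plain"
git submodule --quiet init
git -c protocol.file.allow=always submodule --quiet update
AFTER_UPDATE=$(ls "$WORK/plain/sub")

# One-step alternative: clone and populate submodules together.
git -c protocol.file.allow=always clone -q --recursive "$WORK/main-origin" "$WORK/full"
FULL_FILES=$(ls "$WORK/full/sub")
echo "plain: [$PLAIN_FILES] after update: [$AFTER_UPDATE] recursive: [$FULL_FILES]"
```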

Putting the third-party repository on a branch

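The steps so far are not quite complete, because the files in main/sub are not on any branch: git submodule update checks out the recorded commit directly, which leaves the submodule in a detached HEAD state. (The sub directory looks empty on the remote for a different reason: main stores only a pointer to a commit of sub, not its files.) Let's take a look: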

cd path/to/main/sub
➜ sub git:(2b79b47) git branch
* (HEAD detached at 2b79b47)
master

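Everything else is on master, but this directory is at 2b79b47. That is not a branch: it is the ID of the commit of sub that main currently records.
Leaving the submodule in this state is risky. Commits made on a detached HEAD belong to no branch, so they are easy to lose as soon as something else is checked out. Before changing anything inside the submodule, check out a branch: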

cd path/to/main/sub
git checkout master
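
The detached state, and the effect of checking out master, can be observed with a short sketch. It assumes throwaway local repositories in a temporary directory; the repository names and the new_repo helper are hypothetical, and protocol.file.allow=always is needed only because the submodule URL is a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
WORK=$(mktemp -d)

# new_repo DIR FILE: hypothetical helper, creates a repository with one commit on master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "demo" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

new_repo "$WORK/sub-origin" lib.txt
new_repo "$WORK/main-origin" README
git -C "$WORK/main-origin" -c protocol.file.allow=always submodule --quiet add "$WORK/sub-origin" sub
git -C "$WORK/main-origin" commit -q -m "add sub"

git -c protocol.file.allow=always clone -q --recursive "$WORK/main-origin" "$WORK/main"
cd "$WORK/main/sub"

# symbolic-ref fails when HEAD is detached, so fall back to a label.
BEFORE=$(git symbolic-ref -q --short HEAD || echo detached)
git checkout -q master
AFTER=$(git symbolic-ref -q --short HEAD || echo detached)
echo "$BEFORE -> $AFTER"
```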

Updating the third-party repository

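If something changes in main/sub, first create a commit inside that directory, then commit once more in main.
The first commit records the change in sub's own history. The second updates main's history by moving the pointer that main keeps to sub's current commit.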
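
The two-commit workflow can be sketched as follows. The sketch assumes throwaway local repositories in a temporary directory; the names, the new_repo helper, and the file contents are hypothetical, and protocol.file.allow=always is used only because the submodule URL is a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
WORK=$(mktemp -d)

# new_repo DIR FILE: hypothetical helper, creates a repository with one commit on master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "demo" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

new_repo "$WORK/sub-origin" lib.txt
new_repo "$WORK/main" README
cd "$WORK/main"
git -c protocol.file.allow=always submodule --quiet add "$WORK/sub-origin" sub
git commit -q -m "add sub"
OLD_RECORDED=$(git ls-tree HEAD sub | awk '{print $3}')

# First commit: inside the submodule, recorded in sub's own history.
cd "$WORK/main/sub"
echo "local fix" >> lib.txt
git commit -q -am "change made inside sub"
NEW_SUB=$(git rev-parse HEAD)

# Second commit: in main, which moves the pointer to the new sub commit.
cd "$WORK/main"
git add sub
git commit -q -m "point main at the new sub commit"
RECORDED=$(git ls-tree HEAD sub | awk '{print $3}')
echo "main recorded $OLD_RECORDED, now records $RECORDED"
```

In a real project the commit in sub also has to be pushed before main is pushed, otherwise other clones cannot fetch the commit that main points at.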

If many submodules need updating, the batch commands in the next section help.

Updating third-party repositories in bulk

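Suppose your project includes 100 third-party repositories. When you need to sync them, do you really have to run the following in every one of them?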

$ cd module-dir/
$ git checkout master
$ git pull

Git has this covered.
See git help submodule for the details.

git submodule foreach <command>
# for example:
git submodule foreach git checkout master

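This command visits every checked-out submodule listed in .gitmodules and runs the command that follows foreach inside each one.
For example, to update every module to its latest version: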

git submodule foreach git pull
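
To see foreach in action, here is a small sandbox sketch with two submodules. It assumes throwaway local repositories in a temporary directory; the names liba, libb, main, and the new_repo helper are hypothetical, and protocol.file.allow=always is passed only because the submodule URLs are local paths.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
WORK=$(mktemp -d)

# new_repo DIR FILE: hypothetical helper, creates a repository with one commit on master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "demo" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

new_repo "$WORK/liba-origin" a.txt
new_repo "$WORK/libb-origin" b.txt
new_repo "$WORK/main" README
cd "$WORK/main"
git -c protocol.file.allow=always submodule --quiet add "$WORK/liba-origin" liba
git -c protocol.file.allow=always submodule --quiet add "$WORK/libb-origin" libb
git commit -q -m "add liba and libb"

# --quiet hides the "Entering ..." lines so only the command's own output remains.
# Single quotes defer expansion: $sm_path is set by foreach inside each submodule.
VISITED=$(git submodule --quiet foreach 'echo "$sm_path on $(git rev-parse --abbrev-ref HEAD)"')
echo "$VISITED"
```

The command string is evaluated once per submodule, with the submodule directory as the working directory, so any git command placed there runs against that submodule.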

How do you remove a submodule?

Before Git 1.7.8, the command to remove a given submodule was:

git rm <submodule-name>

In newer versions of Git, run the following commands instead:

$ git submodule deinit -f -- mymodule
$ rm -rf .git/modules/mymodule
$ git rm -f mymodule
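
The three removal steps can be tried end to end in a sandbox. This is a sketch under stated assumptions: the repositories are throwaway local ones in a temporary directory, the names mymodule-origin and main and the new_repo helper are hypothetical, and protocol.file.allow=always is needed only because the submodule URL is a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
WORK=$(mktemp -d)

# new_repo DIR FILE: hypothetical helper, creates a repository with one commit on master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "demo" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

new_repo "$WORK/mymodule-origin" lib.txt
new_repo "$WORK/main" README
cd "$WORK/main"
git -c protocol.file.allow=always submodule --quiet add "$WORK/mymodule-origin" mymodule
git commit -q -m "add mymodule"

# Unregister the submodule from .git/config and empty its working directory.
git submodule --quiet deinit -f -- mymodule
# Delete the submodule's repository data kept inside main's .git directory.
rm -rf .git/modules/mymodule
# Remove the gitlink from the index and the matching section from .gitmodules.
git rm -q -f mymodule
git commit -q -m "remove mymodule"

REMAINING=$(git submodule status)
echo "submodules left: [$REMAINING]"
```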

To see which submodules a repository has, look at .gitmodules:

➜  main  git:(master) cat .gitmodules
[submodule "sub"]
path = sub
url = the/path/of/sub.git

Pulling all submodules

git submodule foreach git pull

git submodule foreach --recursive git submodule init

git submodule foreach --recursive git submodule update

$ git submodule add <url> <path>
  • url: replace with the address of the submodule repository you want to add
  • path: the local path where the submodule will be placed

After the add command succeeds, a .gitmodules file appears at the root of the repository. It contains the entry we just added.

To specify a branch when adding a submodule, use the -b option:

$ git submodule add -b <branch> <url> <path>

Without a branch

$ git submodule add https://github.com/tensorflow/benchmarks.git 3rdparty/benchmarks

Contents of .gitmodules

[submodule "3rdparty/benchmarks"]
path = 3rdparty/benchmarks
url = https://github.com/tensorflow/benchmarks.git

With a branch

$ git submodule add -b cnn_tf_v1.10_compatible https://github.com/tensorflow/benchmarks.git 3rdparty/benchmarks

Contents of .gitmodules

[submodule "3rdparty/benchmarks"]
path = 3rdparty/benchmarks
url = https://github.com/tensorflow/benchmarks.git
branch = cnn_tf_v1.10_compatible
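
The branch entry can also be read back with git config instead of opening the file. The sketch below assumes throwaway local repositories in a temporary directory; the branch name stable, the path 3rdparty/lib, and the new_repo helper are hypothetical, and protocol.file.allow=always is needed only because the submodule URL is a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
WORK=$(mktemp -d)

# new_repo DIR FILE: hypothetical helper, creates a repository with one commit on master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "demo" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

new_repo "$WORK/lib-origin" lib.txt
git -C "$WORK/lib-origin" branch stable
new_repo "$WORK/main" README
cd "$WORK/main"
git -c protocol.file.allow=always submodule --quiet add -b stable "$WORK/lib-origin" 3rdparty/lib

# -b writes a branch key into the submodule's section of .gitmodules ...
BRANCH=$(git config -f .gitmodules submodule.3rdparty/lib.branch)
# ... and checks that branch out inside the new submodule.
CHECKED_OUT=$(git -C 3rdparty/lib rev-parse --abbrev-ref HEAD)
echo "configured: $BRANCH, checked out: $CHECKED_OUT"
```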

Usage

After cloning a repository that contains submodules, the submodule directories are empty. Run the following command to fetch their source:

$ git submodule update --init --recursive

This command combines the two commands below, and --recursive additionally applies them to nested submodules:

$ git submodule init
$ git submodule update

Updating

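After adding someone else's repository, we have to update it manually whenever its author publishes changes. That is, enter the submodule and run git pull, or run one of the alternatives from the root of the main repository:

$ git pull
# or run one of the following from the root directory
# after the remote repository has been updated, errors like these can appear:
# fatal: Needed a single revision
# fatal: Unable to find current revision in submodule path 'xxx'

$ git pull --recurse-submodules
# or
$ git submodule foreach git pull origin master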

Removing

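  1. Delete the submodule directory and its source
$ rm -rf <submodule-dir>
  2. Delete the submodule's entry from .gitmodules
$ vi .gitmodules
  3. Delete the submodule's entry from .git/config
$ vi .git/config
  4. Delete the submodule's directory under .git/modules/
$ rm -rf .git/modules/<submodule-dir>
  5. Remove the submodule from the Git index
$ git rm --cached <submodule-dir>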

To read

Using submodules in Git - Tutorial (vogella.com)

Lars Vogel (c) 2009-2022 vogella GmbH. Version 5.8, 10.08.2015

TABLE OF CONTENTS

  • 1. Submodules - repositories inside other Git repositories
  • 2. Working with repositories that contain submodules
  • 3. Creating repositories with submodules
  • 4. Links and Literature

This tutorial explains the usage of submodules with the Git version control system.

1. Submodules - repositories inside other Git repositories

1.1. Using Git repositories inside other Git repositories

Git allows you to include other Git repositories called submodules into a repository. This allows you to track changes in several repositories via a central one. Submodules are Git repositories nested inside a parent Git repository at a specific path in the parent repository’s working directory. A submodule can be located anywhere in a parent Git repository’s working directory and is configured via a .gitmodules file located at the root of the parent repository. This file contains which paths are submodules and what URL should be used when cloning and fetching for that submodule. Submodule support includes support for adding, updating, synchronizing, and cloning submodules.

Git allows you to commit, pull and push to these repositories independently.

Submodules allow you to keep projects in separate repositories but still be able to reference them as folders in the working directory of other repositories.

2. Working with repositories that contain submodules

2.1. Cloning a repository that contains submodules

If you want to clone a repository including its submodules you can use the --recursive parameter.

git clone --recursive [URL to Git repo]

2.2. Downloading multiple submodules at once

Since a repository can include many submodules, downloading them all sequentially can take a long time. For this reason the clone and submodule update commands support the --jobs parameter to fetch multiple submodules at the same time.

# download up to 8 submodules at once
git submodule update --init --recursive --jobs 8
git clone --recursive --jobs 8 [URL to Git repo]
# short version
git submodule update --init --recursive -j 8

2.3. Pulling with submodules

Once you have set up the submodules you can update the repository with fetch/pull as you normally would. To pull everything including the submodules, use the --recurse-submodules parameter of the git pull command. To move the submodules to the latest commit of their tracked branches, use the --remote parameter of the git submodule update command.

# pull all changes in the repo including changes in the submodules
git pull --recurse-submodules

# pull all changes for the submodules
git submodule update --remote

2.4. Executing a command on every submodule

Git provides a command that lets us execute an arbitrary shell command on every submodule. To allow execution in nested subprojects the --recursive parameter is supported. For our example we assume that we want to reset all submodules.

git submodule foreach 'git reset --hard'
# including nested submodules
git submodule foreach --recursive 'git reset --hard'

3. Creating repositories with submodules

3.1. Adding a submodule to a Git repository and tracking a branch

If you add a submodule, you can specify which branch should be tracked via the -b parameter of the submodule add command. The git submodule init command creates the local configuration file for the submodules, if this configuration does not exist.

# add a new submodule to an existing Git repository
# and define master as the branch you want to track
git submodule add -b master [URL to Git repo]
# initialize submodule configuration
git submodule init

If you track branches in your submodules, you can update them via the --remote parameter of the git submodule update command. This pulls in new commits into the main repository and its submodules. It also changes the working directories of the submodules to the commit of the tracked branch.

# update your submodule --remote fetches new commits in the submodules
# and updates the working tree to the commit described by the branch
git submodule update --remote
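
The effect of --remote can be observed in a sandbox. This is a minimal sketch, assuming throwaway local repositories in a temporary directory; the names sub-origin and main and the new_repo helper are hypothetical, and protocol.file.allow=always is passed only because the submodule URL is a local path.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
WORK=$(mktemp -d)

# new_repo DIR FILE: hypothetical helper, creates a repository with one commit on master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo "demo" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

new_repo "$WORK/sub-origin" lib.txt
new_repo "$WORK/main" README
cd "$WORK/main"
git -c protocol.file.allow=always submodule --quiet add -b master "$WORK/sub-origin" sub
git commit -q -m "add sub tracking master"

# The upstream project publishes a new commit on master.
echo "v2" >> "$WORK/sub-origin/lib.txt"
git -C "$WORK/sub-origin" commit -q -am "upstream change"
UPSTREAM=$(git -C "$WORK/sub-origin" rev-parse HEAD)

# --remote fetches in the submodule and checks out the tip of the tracked branch.
git -c protocol.file.allow=always submodule --quiet update --remote
CHECKED_OUT=$(git -C sub rev-parse HEAD)

# main keeps recording the old commit until the moved pointer is committed.
git add sub
git commit -q -m "move sub to latest master"
RECORDED=$(git ls-tree HEAD sub | awk '{print $3}')
echo "upstream $UPSTREAM, checked out $CHECKED_OUT, recorded $RECORDED"
```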

3.2. Adding a submodule and tracking commits

Alternatively to the tracking of a branch, you can also control which commit of the submodule should be used. In this case the Git parent repository tracks the commit that should be checked out in each configured submodule. Performing a submodule update checks out that specific revision in the submodule’s Git repository. You commonly perform this task after you pull a change in the parent repository that updates the revision checked out in the submodule. You would then fetch the latest changes in the submodule’s Git repository and perform a submodule update to check out the current revision referenced in the parent repository. Performing a submodule update is also useful when you want to restore your submodule’s repository to the current commit tracked by the parent repository. This is common when you are experimenting with different checked out branches or tags in the submodule and you want to restore it back to the commit tracked by the parent repository. You can also change the commit that is checked out in each submodule by performing a checkout in the submodule repository and then committing the change in the parent repository.

You add a submodule to a Git repository via the git submodule add command.

# add a submodule to the existing Git repository
git submodule add [URL to Git repo]
# initialize submodule configuration
git submodule init

3.3. Updating which commit you are tracking

The relevant state for the submodules is defined by the main repository. If you commit in your main repository, the state of the submodule is also defined by this commit.

The git submodule update command sets the Git repository of the submodule to that particular commit. The submodule repository tracks its own content which is nested into the main repository. The main repository refers to a commit of the nested submodule repository.

Use the git submodule update command to set the submodules to the commit specified by the main repository. This means that if you pull in new changes into the submodules, you need to create a new commit in your main repository in order to track the updates of the nested submodules.

The following example shows how to update a submodule to its latest commit in its master branch.

# update submodule in the master branch
# skip this if you use --recurse-submodules
# and have the master branch checked out
cd [submodule directory]
git checkout master
git pull

# commit the change in main repo
# to use the latest commit in master of the submodule
cd ..
git add [submodule directory]
git commit -m "move submodule to latest commit in master"

# share your changes
git push

Another developer can get the update by pulling in the changes and running the submodules update command.

# another developer wants to get the changes
git pull

# this updates the submodule to the latest
# commit in master as set in the last example
git submodule update

With this setup you need to create a new commit in the main repository to use a new state in the submodule. You need to repeat this procedure every time you want to use another state in one of the submodules. See Adding a submodule to a Git repository and tracking a branch for tracking a certain branch of a submodule.