[Docker] Exploring Docker's bridge network from the perspective of namespaces and routing

Date: 2023-12-03 11:07:35


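Bridge networking is Docker's default network mode. In a bridge network, Docker creates a virtual network interface for each container and assigns the container an IP address. Containers can communicate with the host and with other containers through the bridge, and can also publish ports for external access.

How containers communicate with each other

First, create two containers: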

$ docker container run -d --rm --name box1 busybox /bin/sh -c "while true; do sleep 3600; done"
e6e89f95de12eeda726fed5f4f909d32be2ea13c3cecb350acd86bc13394b769

$ docker container run -d --rm --name box2 busybox /bin/sh -c "while true; do sleep 3600; done"
c0c1a152155bcf66bed71fdc51e558f4c3b1c3632866c61a69303a4da10c2f54

$ docker container ls
CONTAINER ID   IMAGE     COMMAND                  CREATED          STATUS          PORTS     NAMES
c0c1a152155b   busybox   "/bin/sh -c 'while t…"   31 seconds ago   Up 30 seconds             box2
e6e89f95de12   busybox   "/bin/sh -c 'while t…"   41 seconds ago   Up 40 seconds             box1

Next, look up box2's IP address and then ping it from box1:

$ docker container exec -it box2 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
3: sit0@NONE: <NOARP> mtu 1480 qdisc noop qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
21: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

$ docker container exec -it box1 ping 172.17.0.3 -c 3
PING 172.17.0.3 (172.17.0.3): 56 data bytes
64 bytes from 172.17.0.3: seq=0 ttl=64 time=0.886 ms
64 bytes from 172.17.0.3: seq=1 ttl=64 time=0.049 ms
64 bytes from 172.17.0.3: seq=2 ttl=64 time=0.106 ms

--- 172.17.0.3 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.049/0.347/0.886 ms

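Why can box1 ping box2? How do containers talk to each other?

Docker uses namespaces to isolate resources such as networking and compute. So why does the ip netns command show no network namespaces on the host?

The reason is that ip netns only lists the entries under /var/run/netns, and Docker does not create links there for the network namespaces it sets up. This makes it harder to analyze the network and to troubleshoot problems.

The namespaces can be made visible to ip netns as follows.

Run the following commands to get the PID of each container's process: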

$ docker inspect box1 | grep Pid
            "Pid": 43568,
            "PidMode": "",
            "PidsLimit": null,

$ docker inspect box2 | grep Pid
            "Pid": 43640,
            "PidMode": "",
            "PidsLimit": null,

Run the following commands to link each process's network namespace into the host directory:

$ ln -s /proc/43568/ns/net /var/run/netns/box1

$ ln -s /proc/43640/ns/net /var/run/netns/box2

If the /var/run/netns directory does not exist, create it manually as root.

The ip netns command now shows the containers' network namespaces:
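The two steps above (look up the PID, then link its network namespace) can be folded into one small helper. This is only a sketch: the function name expose_netns and the NETNS_DIR variable are made up for illustration, and it assumes the docker CLI is on the PATH and that it runs as root.

```shell
# Directory that `ip netns` scans for named namespaces.
NETNS_DIR=${NETNS_DIR:-/var/run/netns}

# expose_netns NAME: make container NAME's network namespace visible to `ip netns`.
expose_netns() {
    name=$1
    # Ask Docker for the PID of the container's main process.
    pid=$(docker inspect -f '{{.State.Pid}}' "$name") || return 1
    # A stopped container reports PID 0, which has no namespace to link.
    [ "$pid" -gt 0 ] 2>/dev/null || return 1
    mkdir -p "$NETNS_DIR"
    # -f replaces a stale link left over from an earlier container run.
    ln -sfn "/proc/$pid/ns/net" "$NETNS_DIR/$name"
}
```

For the containers in this article it would be called as expose_netns box1 and expose_netns box2. The link becomes stale once the container exits, so it is worth deleting afterwards.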

$ ip netns list
box2 (id: 3)
box1 (id: 2)

Check the IP addresses in network namespaces box1 and box2:

$ ip netns exec box1 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
3: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
19: eth0@if20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

$ ip netns exec box2 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
3: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
21: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

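Network namespace box1 has IP 172.17.0.2 and network namespace box2 has IP 172.17.0.3. For two network namespaces on the same subnet to communicate, they need a bridge.

By default Docker creates a bridge named docker0: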

$ ip link show type bridge
9: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:53:1d:f7:5f brd ff:ff:ff:ff:ff:ff

Then look at the veth ports attached to docker0:

$ brctl show docker0
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242531df75f       no              vetha7d1dd5
                                                        vethadaa66f

docker0 has two veth ports: vetha7d1dd5 and vethadaa66f.

Now list the veth interfaces on the host:

$ ip link show type veth
20: vethadaa66f@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
    link/ether 52:4c:41:8c:91:01 brd ff:ff:ff:ff:ff:ff link-netns box1
22: vetha7d1dd5@if21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
    link/ether 8a:e9:19:ce:72:cb brd ff:ff:ff:ff:ff:ff link-netns box2

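We can see that network namespace box1 is connected to docker0 through the veth pair eth0 (if19) - vethadaa66f (if20), and network namespace box2 is connected to docker0 through the veth pair eth0 (if21) - vetha7d1dd5 (if22). This is what lets box1 and box2 communicate.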
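The pairing can also be found without reading the @ifNN suffixes by eye. The sketch below relies on the sysfs files ifindex (an interface's own index) and iflink (for a veth, the index of its peer). The function names and the SYS_NET variable are hypothetical, and it assumes the container image ships cat and has /sys mounted, as busybox does here.

```shell
# Root of the sysfs network device tree; overridable so the lookup can be tried on a copy.
SYS_NET=${SYS_NET:-/sys/class/net}

# veth_by_ifindex INDEX: print the host interface whose ifindex equals INDEX.
veth_by_ifindex() {
    want=$1
    for dev in "$SYS_NET"/*; do
        # Skip entries that are not network devices.
        [ -r "$dev/ifindex" ] || continue
        if [ "$(cat "$dev/ifindex")" = "$want" ]; then
            basename "$dev"
            return 0
        fi
    done
    return 1
}

# host_veth_of CONTAINER: read eth0's peer index inside the container,
# then look that index up on the host.
host_veth_of() {
    peer=$(docker exec "$1" cat /sys/class/net/eth0/iflink) || return 1
    veth_by_ifindex "$peer"
}
```

With the interfaces listed above, host_veth_of box1 would be expected to report vethadaa66f, since eth0 in box1 points at index 20.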

Here is the network topology:

[Figure: bridge network topology diagram]
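The wiring described in this section can be rebuilt by hand with iproute2, which may help in seeing that nothing Docker-specific is involved. The following is a rough sketch, not what Docker literally executes: the names br-demo, ns1, ns2, veth-*, ceth-* and the 10.10.0.0/24 addresses are all invented for the demo, and the real commands need root. Setting DRY_RUN prints the commands instead of running them.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# attach_ns BRIDGE NS ADDR: create namespace NS and plug it into BRIDGE with a veth pair.
attach_ns() {
    bridge=$1 ns=$2 addr=$3
    run ip netns add "$ns"
    # One end stays on the host, the other end is moved into the namespace.
    run ip link add "veth-$ns" type veth peer name "ceth-$ns"
    run ip link set "ceth-$ns" netns "$ns"
    # The host end becomes a port of the bridge.
    run ip link set "veth-$ns" master "$bridge"
    run ip link set "veth-$ns" up
    run ip netns exec "$ns" ip addr add "$addr" dev "ceth-$ns"
    run ip netns exec "$ns" ip link set "ceth-$ns" up
}

# demo_bridge: one bridge with two namespaces on the same subnet,
# mirroring docker0 with box1 and box2.
demo_bridge() {
    run ip link add br-demo type bridge
    run ip link set br-demo up
    attach_ns br-demo ns1 10.10.0.2/24
    attach_ns br-demo ns2 10.10.0.3/24
}
```

Running DRY_RUN=1 demo_bridge lists the sixteen commands. After a real run as root, ip netns exec ns1 ping -c 3 10.10.0.3 should get replies, unless the host filters bridged traffic through iptables.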

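How containers access external networks

Network namespaces plus a bridge only provide communication between namespaces. To reach external networks, a container also relies on SNAT implemented with iptables.

Ping Baidu from box1: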

$ docker exec -it box1 ping www.baidu.com -c 3
PING www.baidu.com (14.119.104.189): 56 data bytes
64 bytes from 14.119.104.189: seq=0 ttl=51 time=9.908 ms
64 bytes from 14.119.104.189: seq=1 ttl=51 time=14.939 ms
64 bytes from 14.119.104.189: seq=2 ttl=51 time=11.023 ms

--- www.baidu.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 9.908/11.956/14.939 ms

List the iptables rules:

$ iptables -nvxL -t nat
Chain PREROUTING (policy ACCEPT 20 packets, 3083 bytes)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 1 packets, 229 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 2 packets, 137 bytes)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 2 packets, 137 bytes)
    pkts      bytes target     prot opt in     out     source               destination
       6      300 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0

Chain DOCKER (2 references)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0

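The POSTROUTING chain of the nat table contains a rule that applies SNAT (MASQUERADE) to packets whose source is in 172.17.0.0/16 and that leave through any interface other than docker0. This is what lets containers talk to external networks.

Now flush all iptables rules: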

$ iptables -t filter -F
$ iptables -t filter -X
$ iptables -t filter -Z
$ iptables -t nat -F
$ iptables -t nat -X
$ iptables -t nat -Z

List the rules again; both the rules and the user-defined chains are gone:

$ iptables -t filter -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

$ iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination

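Try to reach Baidu again. It fails, and it already fails at name resolution, presumably because the container's DNS queries can no longer get out either: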

$ docker exec -it box1 ping www.baidu.com -c 3
ping: bad address 'www.baidu.com'

Add a NAT rule by hand with iptables:

$ iptables -t nat -A POSTROUTING -s 172.17.0.0/16 -j MASQUERADE

Access Baidu again; communication works:

$ docker exec -it box1 ping www.baidu.com -c 3
PING www.baidu.com (14.119.104.189): 56 data bytes
64 bytes from 14.119.104.189: seq=0 ttl=51 time=16.015 ms
64 bytes from 14.119.104.189: seq=1 ttl=51 time=9.960 ms
64 bytes from 14.119.104.189: seq=2 ttl=51 time=9.247 ms

--- www.baidu.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 9.247/11.740/16.015 ms
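Note that the rule added by hand is broader than the one Docker had installed, which also carried ! -o docker0. A minimal sketch of restoring the narrower form, and only when it is missing, might look like this. The function name ensure_masquerade and the IPTABLES variable are hypothetical; -C is the iptables option that checks whether an identical rule exists.

```shell
# Command used to talk to netfilter; overridable for a dry run.
IPTABLES=${IPTABLES:-iptables}

# ensure_masquerade SUBNET BRIDGE: add the SNAT rule only if it is not already present.
ensure_masquerade() {
    subnet=$1 bridge=$2
    # -C exits non-zero when no identical rule exists.
    if ! $IPTABLES -t nat -C POSTROUTING -s "$subnet" ! -o "$bridge" -j MASQUERADE 2>/dev/null; then
        $IPTABLES -t nat -A POSTROUTING -s "$subnet" ! -o "$bridge" -j MASQUERADE
    fi
}
```

For the default bridge in this article the call would be ensure_masquerade 172.17.0.0/16 docker0. Calling it twice leaves a single rule, whereas repeating a plain -A would append a duplicate each time.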

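Sometimes the default policy of the FORWARD chain in the filter table is DROP. In that case it has to be changed to ACCEPT by hand before traffic can pass, using the following command: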

$ iptables -P FORWARD ACCEPT
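Before changing the policy it can be useful to confirm what it currently is. The helper below is a small sketch (forward_policy is a made-up name) that reads the output of iptables -S FORWARD, whose policy line has the form -P FORWARD DROP, and prints just the policy.

```shell
# forward_policy: read `iptables -S FORWARD` output on stdin
# and print the chain's default policy (for example ACCEPT or DROP).
forward_policy() {
    awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}
```

It would be used as iptables -S FORWARD | forward_policy, and the policy only needs changing when the result is DROP.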

Because we bluntly flushed iptables, all of Docker's rules are gone. How can Docker's default rules be restored? Restart Docker with the following command:

$ service docker restart

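Of course, if you do not mind the effort, you can also add the rules back by hand one at a time.

How port forwarding works

When creating a container, the -p option maps a host port to a container port, so that requests to the host port are forwarded into the container.

First create an nginx web container, mapping host port 8080 to container port 80: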

$ docker container run -d --rm --name web -p 8080:80 nginx
441c77091abfeb9498d4fd21d62594d75363fb42338c4ec51a42b6f01d80e418

Access port 8080 on the host; the request reaches the container:

$ curl localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

How is this port forwarding implemented? Through our old friend iptables again, this time using DNAT.

List the iptables rules:

$ iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  tcp  --  *      *       172.17.0.2           172.17.0.2           tcp dpt:80

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0
    0     0 DNAT       tcp  --  !docker0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8080 to:172.17.0.2:80

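The following rule was added to the POSTROUTING chain of the nat table. It masquerades traffic whose source and destination are both the web container (172.17.0.2) on port 80, which covers the case where the container connects to its own published port; general outbound access is already handled by the 172.17.0.0/16 rule: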

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  tcp  --  *      *       172.17.0.2           172.17.0.2           tcp dpt:80

A rule was also added to the DOCKER chain (which PREROUTING and OUTPUT both jump to), forwarding requests for host port 8080 to 172.17.0.2:80:

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DNAT       tcp  --  !docker0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8080 to:172.17.0.2:80

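Next, let's start a container without the -p option and configure port forwarding by hand with iptables rules.

Start a web container from the nginx image without port forwarding (stop the previous web container first, since the name web would otherwise still be in use):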

$ docker container run -d --rm --name web nginx

Listing the iptables rules now shows only Docker's base rules; no new forwarding rule was added:

$ iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0

At this point port 8080 on the host is not reachable:

$ curl 172.19.85.122:8080
curl: (7) Failed to connect to 172.19.85.122 port 8080: Connection refused

Add the DNAT rule:

$ iptables -t nat -I DOCKER ! -i docker0 -p tcp --dport 8080 -j DNAT --to-destination 172.17.0.2:80

$ iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    2   120 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DNAT       tcp  --  !docker0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8080 to:172.17.0.2:80
    0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0

The web container can now be reached through port 8080 on the host:

$ curl 172.19.85.122:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
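The manual DNAT rule above can be generalized so that it accepts the same HOSTPORT:CONTAINERPORT form as -p. This is a sketch under assumptions: publish_port and the IPTABLES variable are hypothetical names, only TCP is handled, and the container IP has to be supplied by the caller.

```shell
# Command used to talk to netfilter; overridable for a dry run.
IPTABLES=${IPTABLES:-iptables}

# publish_port SPEC IP [BRIDGE]: SPEC is HOSTPORT:CONTAINERPORT, as passed to `docker run -p`.
publish_port() {
    spec=$1 ip=$2 bridge=${3:-docker0}
    host_port=${spec%%:*}   # text before the colon
    ctr_port=${spec##*:}    # text after the colon
    # Same rule as the manual one above, with the ports and address filled in.
    $IPTABLES -t nat -I DOCKER ! -i "$bridge" -p tcp --dport "$host_port" \
        -j DNAT --to-destination "$ip:$ctr_port"
}
```

publish_port 8080:80 172.17.0.2 issues the rule used in this section. The rule disappears when Docker is restarted or the nat table is flushed.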


From: https://blog.51cto.com/morris131/8665656
