What is JupyterHub?
JupyterHub is the best way to serve Jupyter notebooks to multiple users. Because JupyterHub manages a separate Jupyter environment for each user, it can be used for a class of students, a corporate data science group, or a scientific research group. It is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.
JupyterHub is made up of four subsystems:
- a Hub (tornado process) that is the heart of JupyterHub
- a configurable HTTP proxy (node-http-proxy) that receives the requests from the client’s browser
- multiple single-user Jupyter notebook servers (Python/IPython/tornado) that are monitored by Spawners
- an authentication class that manages how users can access the system
JupyterHub performs the following functions:
- The Hub launches a proxy
- The proxy forwards all requests to the Hub by default
- The Hub handles user login and spawns single-user servers on demand
- The Hub configures the proxy to forward URL prefixes to the single-user notebook servers
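The routing behaviour described in the list above can be illustrated with a toy model. This is only a sketch of the idea (a default route to the Hub plus longer URL prefixes for running user servers), not the proxy's real implementation. The class name `RoutingTable`, the user name `alice`, and all addresses are hypothetical.

```python
class RoutingTable:
    """Toy model of a prefix-based routing table (illustration only)."""

    def __init__(self, hub_target):
        # By default every request goes to the Hub.
        self.routes = {"/": hub_target}

    def add_route(self, prefix, target):
        """Register a URL prefix, as the Hub does once a user server is up."""
        self.routes[prefix.rstrip("/") + "/"] = target

    def resolve(self, path):
        """Return the target registered for the longest matching prefix."""
        candidate = path if path.endswith("/") else path + "/"
        best = "/"
        for prefix in self.routes:
            if candidate.startswith(prefix) and len(prefix) > len(best):
                best = prefix
        return self.routes[best]


table = RoutingTable(hub_target="http://127.0.0.1:8081")
# No user route exists yet, so this request falls through to the Hub.
print(table.resolve("/hub/login"))
# After a spawn, the Hub adds a prefix route for that user's server.
table.add_route("/user/alice", "http://127.0.0.1:49153")
print(table.resolve("/user/alice/tree"))
```

Matching on whole path segments (note the trailing slash) keeps `/user/alicia` from being routed to the server registered for `/user/alice`.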
Installing JupyterHub
Docker image repository for JupyterHub: https://hub.docker.com/r/jupyterhub/jupyterhub
Questions to consider before deploying:
- Deployment system (bare metal, Docker)
- Authentication (PAM, OAuth, etc.)
- Spawner of single-user notebook servers (Docker, Batch, etc.)
- Services (nbgrader, etc.)
- JupyterHub database (default SQLite; or a traditional RDBMS such as PostgreSQL, MySQL, or another database supported by SQLAlchemy)
Directories and local files
It is recommended to put all of the files used by JupyterHub into standard UNIX filesystem locations.
- /srv/jupyterhub for all security and runtime files
- /etc/jupyterhub for all configuration files
- /var/log for log files
Deployment
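I seriously doubt that the /etc/jupyterhub recommendation is accurate for this setup: jupyterhub_config.py has no effect when placed in that directory, but it works as expected when placed in /srv/jupyterhub.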
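A likely explanation is that JupyterHub looks for jupyterhub_config.py relative to its working directory unless it is given an explicit path, and that the container starts the Hub from /srv/jupyterhub. The second part is an assumption about the image, not something verified here. The sketch below only illustrates how a relative file name resolves; the helper name `resolve_config_path` is hypothetical.

```python
import posixpath


def resolve_config_path(working_dir, config_file="jupyterhub_config.py"):
    """Resolve a config file name against the process working directory."""
    if posixpath.isabs(config_file):
        # An absolute path is used as given, whatever the working directory is.
        return config_file
    return posixpath.normpath(posixpath.join(working_dir, config_file))


# Relative default name: found next to the working directory.
print(resolve_config_path("/srv/jupyterhub"))
# Explicit absolute path: the working directory no longer matters.
print(resolve_config_path("/srv/jupyterhub", "/etc/jupyterhub/jupyterhub_config.py"))
```

If that assumption holds, starting the Hub with `jupyterhub -f /etc/jupyterhub/jupyterhub_config.py` should make the /etc/jupyterhub location work as well.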
# Create the network for jupyterhub
docker network create --driver bridge jupyterhub_network
# Create the host directory used as the volume
mkdir -pv /data/jupyterhub
chown -R root /data/jupyterhub
chmod -R 777 /data/jupyterhub
# Start jupyterhub with the default configuration
docker run -d --name jupyterhub -p8000:8000 --network jupyterhub_network -v /var/run/docker.sock:/var/run/docker.sock -v /data/jupyterhub:/srv/jupyterhub jupyterhub/jupyterhub:latest
# Open a shell in the container and install oauthenticator (a Python package, installed with pip)
docker exec -it jupyterhub bash
pip install --no-cache-dir oauthenticator
Configuring authentication: GitLabOAuthenticator
JupyterHub OAuthenticator documentation:
https://oauthenticator.readthedocs.io/en/latest/tutorials/install.html
Run vi /data/jupyterhub/jupyterhub_config.py and add the following:
import os
from oauthenticator.gitlab import GitLabOAuthenticator
# Authenticate users against GitLab via OAuth
c.JupyterHub.authenticator_class = GitLabOAuthenticator
# Placeholders: use the values of the OAuth application registered in GitLab
os.environ['OAUTH_CALLBACK_URL'] = 'http://ip:8000/hub/oauth_callback'
os.environ['GITLAB_CLIENT_ID'] = '***'
os.environ['GITLAB_CLIENT_SECRET'] = '***'
os.environ['GITLAB_URL'] = 'https://xxx.github.cn'
os.environ['GITLAB_HOST'] = 'https://xxx.github.cn'
# The configuration section name is case-sensitive and must match the class name exactly
c.GitLabOAuthenticator.client_id = os.environ['GITLAB_CLIENT_ID']
c.GitLabOAuthenticator.client_secret = os.environ['GITLAB_CLIENT_SECRET']
c.GitLabOAuthenticator.gitlab_url = os.environ['GITLAB_URL']
c.GitLabOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
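The configuration above writes the client ID and secret into os.environ from inside the config file. One possible alternative is to pass them into the container from outside (for example with `docker run -e`) and have the config file only read and validate them. The helpers below are a minimal sketch of that idea. `require_env` and `callback_url` are hypothetical names, not part of JupyterHub or oauthenticator.

```python
import os


def require_env(name, environ=None):
    """Return a non-empty environment variable or fail with a clear message."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


def callback_url(base_url):
    """Build the Hub's OAuth callback URL from its public base URL."""
    return base_url.rstrip("/") + "/hub/oauth_callback"


# Example with a stand-in environment instead of the real os.environ.
fake_env = {"GITLAB_CLIENT_ID": "example-id"}
print(require_env("GITLAB_CLIENT_ID", fake_env))
print(callback_url("http://ip:8000/"))
```

Failing early on a missing variable gives a clearer startup error than a KeyError, and it keeps the secret out of the config file.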
# View the jupyterhub container logs
docker logs jupyterhub
[I 2023-02-03 08:03:01.030 JupyterHub roles:477] Adding role server to token: <APIToken('a6e1...', user='xxx', client_id='jupyterhub')>
[I 2023-02-03 08:03:01.035 JupyterHub provider:607] Creating oauth client jupyterhub-user-xxx
[E 2023-02-03 08:03:01.046 JupyterHub user:762] Unhandled error starting chenyuzhe's server: "getpwnam(): name not found: 'xxx'"
[E 2023-02-03 08:03:01.059 JupyterHub pages:311] Error starting server xxx: "getpwnam(): name not found: 'xxx'"
Traceback (most recent call last):
None: None
[W 2023-02-03 08:03:01.059 JupyterHub web:1787] 500 GET /hub/spawn/xxx (): Unhandled error starting server xxx
...
[E 2023-02-03 08:03:01.061 JupyterHub log:189] 500 GET /hub/spawn/xxx (xxx@) 36.46m
The spawn itself failed, but the log shows that authorization already succeeded. The next step is to configure the Spawner.
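The getpwnam() error in the log above has a simple cause. JupyterHub's default spawner starts each server as a local system user with the same name as the authenticated user, and the GitLab user name has no system account inside the container. The message is the one raised by Python's `pwd.getpwnam`, which can be reproduced on any Unix system. The function name `system_user_exists` and the sample user name are hypothetical.

```python
import pwd


def system_user_exists(username):
    """Return True if a local system account with this name exists (Unix only)."""
    try:
        pwd.getpwnam(username)
    except KeyError:
        # pwd.getpwnam raises KeyError("getpwnam(): name not found: ...")
        return False
    return True


print(system_user_exists("root"))
print(system_user_exists("no-such-user-for-this-example"))
```

A spawner that does not depend on local system accounts, such as the Docker-based one configured next, avoids this lookup entirely.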
Configuring the Spawner: DockerSpawner
See the DockerSpawner configuration in the official documentation: https://jupyterhub-dockerspawner.readthedocs.io/en/latest/spawner-types.html#dockerspawner
Run vi /data/jupyterhub/jupyterhub_config.py and add the following:
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
Enter the jupyterhub container, run pip install dockerspawner, and then restart the container.
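The single spawner_class line is all that this walkthrough sets, and it is the configuration behind the logs that follow. For reference, a DockerSpawner setup with the Hub itself running in a container often needs a few more options, so that the single-user containers join the same Docker network and can reach the Hub. The fragment below is a sketch under assumptions: it uses the DockerSpawner class named in the section heading, reuses the network name and container name from the commands earlier in this post, and the timeout value is arbitrary. None of it was part of the logged run.

```python
# jupyterhub_config.py fragment; `c` is provided by JupyterHub when it loads the file.
# Every value here is an assumption for illustration, not a tested configuration.
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'

# Listen on all interfaces inside the Hub container so other containers can connect.
c.JupyterHub.hub_ip = '0.0.0.0'
# Host name that single-user containers use to reach the Hub (the Hub container's name).
c.JupyterHub.hub_connect_ip = 'jupyterhub'

# Attach single-user containers to the network created with `docker network create`.
c.DockerSpawner.network_name = 'jupyterhub_network'
c.DockerSpawner.use_internal_ip = True

# Allow extra time for a first start, when the image may still need to be pulled.
c.Spawner.start_timeout = 300
```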
Run docker logs jupyterhub to view the error log:
[I 2023-02-03 08:27:26.224 JupyterHub roles:477] Adding role server to token: <APIToken('a69c...', user='xxx', client_id='jupyterhub')>
[I 2023-02-03 08:27:26.232 JupyterHub provider:607] Creating oauth client jupyterhub-user-xxx
[I 2023-02-03 08:27:26.256 JupyterHub dockerspawner:1218] pulling image jupyterhub/singleuser:2.0
[I 2023-02-03 08:27:27.218 JupyterHub log:189] 302 GET /hub/spawn/xxx -> /hub/spawn-pending/xxx (chenyuzhe@::ffff:10.100.228.17) 1019.64ms
[I 2023-02-03 08:27:27.241 JupyterHub pages:400] xxx is pending spawn
[I 2023-02-03 08:27:27.264 JupyterHub log:189] 200 GET /hub/spawn-pending/xxx (xxx@::ffff:10.100.228.17) 25.94ms
[W 2023-02-03 08:27:36.218 JupyterHub base:1044] User xxx is slow to start (timeout=10)
[W 2023-02-03 08:28:26.246 JupyterHub user:754] xxx's server failed to start in 60 seconds, giving up.
Common causes of this timeout, and debugging tips:
1. Everything is working, but it took too long.
To fix: increase `Spawner.start_timeout` configuration
to a number of seconds that is enough for spawners to finish starting.
2. The server didn't finish starting,
or it crashed due to a configuration issue.
Check the single-user server's logs for hints at what needs fixing.
[I 2023-02-03 08:29:21.222 JupyterHub dockerspawner:988] Container 'jupyter-xxx' is gone
[W 2023-02-03 08:29:21.222 JupyterHub dockerspawner:963] Container not found: jupyter-xxx
[E 2023-02-03 08:29:21.249 JupyterHub gen:623] Exception in Future <Task finished name='Task-10' coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.8/dist-packages/jupyterhub/handlers/base.py:935> exception=TimeoutError('Timeout')> after timeout
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 618, in error_callback
future.result()
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/handlers/base.py", line 942, in finish_user_spawn
await spawn_future
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/user.py", line 780, in spawn
raise e
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/user.py", line 679, in spawn
url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
tornado.util.TimeoutError: Timeout
[I 2023-02-03 08:29:21.250 JupyterHub dockerspawner:988] Container 'jupyter-xxx' is gone
[I 2023-02-03 08:29:21.254 JupyterHub log:189] 200 GET /hub/api/users/xxx/server/progress (xxx@::ffff:10.100.228.17) 113851.95ms
[I 2023-02-03 08:29:22.759 JupyterHub dockerspawner:1272] Created container jupyter-xxx (id: 69653f3) from image jupyterhub/singleuser:2.0
[I 2023-02-03 08:29:22.759 JupyterHub dockerspawner:1296] Starting container jupyter-xxx (id: 69653f3)
[root@chenyuzhe jupyterhub]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
69653f3de27d jupyterhub/singleuser:2.0 "tini -g -- start-no…" About a minute ago Up About a minute 127.0.0.1:49153->8888/tcp jupyter-xxx
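The traceback above ends in `gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)`: the Hub waits for the spawner for at most start_timeout seconds and then gives up. The same pattern can be sketched with the standard library's asyncio. The function names and the durations are hypothetical, and the timings are scaled down to fractions of a second.

```python
import asyncio


async def start_server(startup_seconds):
    """Stand-in for a spawner start: pretend startup takes this long."""
    await asyncio.sleep(startup_seconds)
    return "http://127.0.0.1:8888"


async def spawn_with_timeout(startup_seconds, start_timeout):
    """Wait for the server URL, or return None if the timeout expires first."""
    try:
        return await asyncio.wait_for(start_server(startup_seconds), timeout=start_timeout)
    except asyncio.TimeoutError:
        return None


# Fast start: finishes well inside the timeout.
print(asyncio.run(spawn_with_timeout(0.01, 1.0)))
# Slow start (for example a long image pull): the timeout wins.
print(asyncio.run(spawn_with_timeout(0.2, 0.05)))
```

The analogy is not exact. `asyncio.wait_for` cancels the pending work on timeout, whereas the log shows the container still being created after the Hub had given up.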
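A few key observations:
- dockerspawner pulled the jupyterhub/singleuser:2.0 image on its own.
- It started the jupyter-xxx container from jupyterhub/singleuser:2.0, but the Hub reported the spawn as failed. The timestamps suggest why: the pull began at 08:27:26, the Hub gave up at 08:28:26 after the 60-second start timeout, and the container was only created at 08:29:22. docker ps shows it running afterwards.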
- The log tells us to check the jupyter-xxx logs, so that is the next thing to look at.
From: https://blog.51cto.com/u_15946369/6035967