[ERROR: AttributeError: module 'tensorflow' has no attribute 'app']
(base) maye@maye-Inspiron-5547:~/github_repository/tensorflow_ecosystem/distribution_strategy$ kubectl describe pod dist-strat-example-worker-0-w6rsb
Name: dist-strat-example-worker-0-w6rsb
Namespace: default
Priority: 0
Service Account: default
Node: maye-inspiron-5547/192.168.0.104
Start Time: Sat, 03 Feb 2024 12:56:01 +0800
Labels: job=worker
name=dist-strat-example
task=0
Annotations:
Status: Running
IP: 10.244.0.30
IPs:
IP: 10.244.0.30
Controlled By: ReplicationController/dist-strat-example-worker-0
Containers:
tensorflow:
Container ID: containerd://4d271f040fdfaeebcc6f111fb6fa6666cee129d52cb429a89407c60c3c1180e6
Image: tf_std_server:v1
Image ID: sha256:d39144c35ea9a32641039358493137fdbce32ee5688b2c307cf255d127e6a0ed
Port: 5000/TCP
Host Port: 0/TCP
Command:
/usr/bin/python
/tf_std_server.py
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sat, 03 Feb 2024 15:38:32 +0800
Finished: Sat, 03 Feb 2024 15:38:37 +0800
Ready: False
Restart Count: 36
Environment:
TF_CONFIG: { "cluster": { "worker": ["dist-strat-example-worker-0:5000","dist-strat-example-worker-1:5000"]}, "task": { "type": "worker", "index": "0" } }
GOOGLE_APPLICATION_CREDENTIALS: /var/secrets/google/key.json
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-b8qlz (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-b8qlz:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
Normal Pulled 59m (x26 over 165m) kubelet Container image "tf_std_server:v1" already present on machine
Warning BackOff 4m55s (x716 over 164m) kubelet Back-off restarting failed container tensorflow in pod dist-strat-example-worker-0-w6rsb_default(a31b45d9-1dbf-43c1-95c6-4f8d8112c5e9)
(base) maye@maye-Inspiron-5547:~/github_repository/tensorflow_ecosystem/distribution_strategy$ kubectl logs dist-strat-example-worker-0-w6rsb
2024-02-03 07:38:33.104872: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/tf_std_server.py", line 35, in <module>
tf.app.run()
^^^^^^
AttributeError: module 'tensorflow' has no attribute 'app'
(base) maye@maye-Inspiron-5547:~/github_repository/tensorflow_ecosystem/distribution_strategy$
[SOLUTION]
This happens because the TensorFlow in use is v2. tf.app.run() is a TensorFlow v1 call, and the module app has been removed in TensorFlow v2. For this script, tf.app.run() is roughly equivalent to main(sys.argv).
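The change to the entry point can be sketched as follows. This is a minimal illustration, not the actual contents of tf_std_server.py: the body of main() here is a hypothetical placeholder, and only the last lines mirror the edit described in this note.

```python
import sys


def main(argv):
    # Hypothetical placeholder for the real main() in tf_std_server.py.
    # argv[0] is the script name, the rest are command-line arguments.
    print("server entry point called with args:", argv[1:])
    return 0


if __name__ == "__main__":
    # TensorFlow v1 style, fails on v2 with
    # "AttributeError: module 'tensorflow' has no attribute 'app'":
    #   tf.app.run()
    # Replacement that does not depend on tf.app:
    main(sys.argv)
```

If the script relies on the v1 flag parsing, tf.compat.v1.app.run() should also be available in TensorFlow 2.x as a compatibility alias, but calling main(sys.argv) directly avoids the dependency altogether.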
- Replace tf.app.run() with main(sys.argv), and add import sys, in the file tf_std_server.py.
- Rebuild the container image of the TensorFlow standard server:
$ cd <directory-which-contains-Dockerfile.tf_std_server>
$ nerdctl build --no-cache -t tf_std_server:v1 -f Dockerfile.tf_std_server . --namespace k8s.io
Attention:
- The . specifies the current directory as the build context, namely nerdctl build will look for the files it needs in this directory. If no directory is specified, it raises the error: "FATA[0004] context needs to be specified".
- If no namespace is specified, the built image is placed in the namespace "default", while crictl (the Container Runtime Interface CLI of Kubernetes) can only see images in the namespace "k8s.io".
Note:
- "exit code: 1": something went wrong in the code the process was executing.
- "exit code: 137": the process received SIGKILL (137 = 128 + 9). In Kubernetes, when the kubelet needs to stop a container, it calls containerd, which sends SIGTERM first and then SIGKILL if the process has not exited within the grace period. Linux also sends SIGKILL to a process when memory runs out (the OOM killer).
- "exit code: 139": segmentation fault (SIGSEGV, 139 = 128 + 11). The process tried to access memory it is not allowed to access, such as an out-of-bounds address.
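The exit codes above follow the common shell convention that a code greater than 128 means "terminated by signal (code - 128)". A minimal sketch of decoding them is shown below. The helper name describe_exit_code is hypothetical, and the signal names assume a Linux/POSIX host.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Return a short description of a container exit code."""
    if code == 0:
        return "success"
    if code > 128:
        # By convention, 128 + N means the process was ended by signal N.
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # Signal number not defined on this platform.
            return f"killed by unknown signal {signum}"
    return "application error"


for code in (1, 137, 139):
    print(code, describe_exit_code(code))
```

The value to pass in is the "Exit Code" field under "Last State" in the kubectl describe pod output.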