2021-09-07 12:39:42.985367: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-09-07 12:39:43.025580: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:1a:00.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-09-07 12:39:43.026309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties:
pciBusID: 0000:1c:00.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-09-07 12:39:43.026976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 2 with properties:
pciBusID: 0000:1d:00.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-09-07 12:39:43.027633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 3 with properties:
pciBusID: 0000:1e:00.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-09-07 12:39:43.028818: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-09-07 12:39:43.070244: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-09-07 12:39:43.107759: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-09-07 12:39:43.161781: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-09-07 12:39:43.194023: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-09-07 12:39:43.225857: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-09-07 12:39:43.278385: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-09-07 12:39:43.283693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0, 1, 2, 3
INFO:GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU')]
INFO:Q/A pairs: {'smooth-or-featured': ['_smooth', '_featured-or-disk', '_artifact'], 'disk-edge-on': ['_yes', '_no'], 'has-spiral-arms': ['_yes', '_no'], 'bar': ['_strong', '_weak', '_no'], 'bulge-size': ['_dominant', '_large', '_moderate', '_small', '_none'], 'how-rounded': ['_round', '_in-between', '_cigar-shaped'], 'edge-on-bulge': ['_boxy', '_none', '_rounded'], 'spiral-winding': ['_tight', '_medium', '_loose'], 'spiral-arm-count': ['_1', '_2', '_3', '_4', '_more-than-4', '_cant-tell'], 'merging': ['_none', '_minor-disturbance', '_major-disturbance', '_merger']}
INFO:Schema: <zoobot.schemas.Schema object at 0x7fff84222ee0>
INFO:tfrecord.io: Loading dataset from ['/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_30.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_43.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_42.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_34.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_1.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_35.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_0.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_41.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_5.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_7.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_12.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_13.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_17.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_46.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_37.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_10.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_14.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_8.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_2.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_16.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_48.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_47.tfrecord', 
'/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_38.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_9.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_45.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_19.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_15.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_20.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_23.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_4.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_11.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_18.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_36.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_39.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_6.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_21.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_3.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_25.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_32.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_27.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_40.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_24.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_22.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_33.tfrecord', 
'/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_44.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_26.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_28.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_31.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_29.tfrecord']
WARNING:Loading multiple tfrecords with interleaving, shuffle=True
INFO:Files to load: ['/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_30.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_43.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_42.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_34.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_1.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_35.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_0.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_41.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_5.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_7.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_12.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_13.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_17.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_46.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_37.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_10.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_14.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_8.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_2.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_16.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_48.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_47.tfrecord', 
'/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_38.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_9.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_45.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_19.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_15.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_20.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_23.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_4.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_11.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_18.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_36.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_39.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_6.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_21.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_3.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_25.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_32.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_27.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_40.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_24.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_22.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_33.tfrecord', 
'/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_44.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_26.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_28.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_31.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/train_shards/s300_shard_29.tfrecord'] (49)
2021-09-07 12:39:44.169196: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2021-09-07 12:39:44.186366: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2400000000 Hz
2021-09-07 12:39:44.198084: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x555559f06d00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-09-07 12:39:44.198115: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-09-07 12:39:44.610798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:1a:00.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-09-07 12:39:44.611550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties:
pciBusID: 0000:1c:00.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-09-07 12:39:44.612189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 2 with properties:
pciBusID: 0000:1d:00.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-09-07 12:39:44.612807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 3 with properties:
pciBusID: 0000:1e:00.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-09-07 12:39:44.612861: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-09-07 12:39:44.612871: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-09-07 12:39:44.612880: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-09-07 12:39:44.612887: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-09-07 12:39:44.612899: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-09-07 12:39:44.612907: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-09-07 12:39:44.612916: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-09-07 12:39:44.617670: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0, 1, 2, 3
2021-09-07 12:39:44.617708: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-09-07 12:39:44.621276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-09-07 12:39:44.621291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0 1 2 3
2021-09-07 12:39:44.621300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N Y Y Y
2021-09-07 12:39:44.621303: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 1: Y N Y Y
2021-09-07 12:39:44.621306: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 2: Y Y N Y
2021-09-07 12:39:44.621309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 3: Y Y Y N
2021-09-07 12:39:44.625164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13820 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:1a:00.0, compute capability: 7.0)
2021-09-07 12:39:44.627224: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 14900 MB memory) -> physical GPU (device: 1, name: Tesla V100-SXM2-16GB, pci bus id: 0000:1c:00.0, compute capability: 7.0)
2021-09-07 12:39:44.629158: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 14900 MB memory) -> physical GPU (device: 2, name: Tesla V100-SXM2-16GB, pci bus id: 0000:1d:00.0, compute capability: 7.0)
2021-09-07 12:39:44.631078: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 14900 MB memory) -> physical GPU (device: 3, name: Tesla V100-SXM2-16GB, pci bus id: 0000:1e:00.0, compute capability: 7.0)
2021-09-07 12:39:44.636009: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55555bd30ab0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-09-07 12:39:44.636026: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla V100-SXM2-16GB, Compute Capability 7.0
2021-09-07 12:39:44.636031: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): Tesla V100-SXM2-16GB, Compute Capability 7.0
2021-09-07 12:39:44.636034: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (2): Tesla V100-SXM2-16GB, Compute Capability 7.0
2021-09-07 12:39:44.636037: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (3): Tesla V100-SXM2-16GB, Compute Capability 7.0
INFO:tfrecord.io: Loading dataset from ['/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_1.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_0.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_5.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_7.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_12.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_10.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_8.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_2.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_9.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_4.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_11.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_6.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_3.tfrecord']
WARNING:Loading multiple tfrecords with interleaving, shuffle=False
INFO:Files to load: ['/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_1.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_0.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_5.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_7.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_12.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_10.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_8.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_2.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_9.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_4.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_11.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_6.tfrecord', '/mnt/lustre3p/groups/ASTR1169/GalaxyZoo/tf_shards/eval_shards/s300_shard_3.tfrecord'] (13)
INFO:Input size 300, crop size 225, final size 224
WARNING:Crop size and final size are similar: skipping resizing and cropping directly to resize_size (ignoring crop_size)
INFO:Building efficientnet to expect input (224, 224, 1), after any preprocessing layers
2021-09-07 12:39:51.949271: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2021-09-07 12:39:51.949340: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1363] Profiler found 4 GPUs
2021-09-07 12:39:52.012240: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcupti.so.10.1
2021-09-07 12:39:52.113527: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1408] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2021-09-07 12:39:52.114476: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1447] function cupti_interface_->ActivityRegisterCallbacks( AllocCuptiActivityBuffer, FreeCuptiActivityBuffer)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2021-09-07 12:39:52.114515: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1430] function cupti_interface_->EnableCallback( 0 , subscriber_, CUPTI_CB_DOMAIN_DRIVER_API, cbid)failed with error CUPTI_ERROR_INVALID_PARAMETER
2021-09-07 12:40:05.655230: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-09-07 12:40:06.374295: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-09-07 12:40:11.470784: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2021-09-07 12:40:11.470883: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1408] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_NOT_INITIALIZED
2021-09-07 12:40:11.471318: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1447] function cupti_interface_->ActivityRegisterCallbacks( AllocCuptiActivityBuffer, FreeCuptiActivityBuffer)failed with error CUPTI_ERROR_NOT_INITIALIZED
2021-09-07 12:40:11.914939: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1430] function cupti_interface_->EnableCallback( 0 , subscriber_, CUPTI_CB_DOMAIN_DRIVER_API, cbid)failed with error CUPTI_ERROR_INVALID_PARAMETER
2021-09-07 12:40:11.918056: I tensorflow/core/profiler/internal/gpu/device_tracer.cc:216] GpuTracer has collected 0 callback api events and 0 activity events.
2021-09-07 12:40:11.931154: I tensorflow/core/profiler/rpc/client/save_profile.cc:168] Creating directory: /mnt/lustre3p/groups/ASTR1169/GalaxyZoo/results/tensorboard/train/plugins/profile/2021_09_07_12_40_11
2021-09-07 12:40:11.939096: I tensorflow/core/profiler/rpc/client/save_profile.cc:174] Dumped gzipped tool data for trace.json.gz to /mnt/lustre3p/groups/ASTR1169/GalaxyZoo/results/tensorboard/train/plugins/profile/2021_09_07_12_40_11/gpu4001.trace.json.gz
2021-09-07 12:40:11.940985: I tensorflow/core/profiler/utils/event_span.cc:288] Generation of step-events took 0 ms
2021-09-07 12:40:11.950389: I tensorflow/python/profiler/internal/profiler_wrapper.cc:87] Creating directory: /mnt/lustre3p/groups/ASTR1169/GalaxyZoo/results/tensorboard/train/plugins/profile/2021_09_07_12_40_11
Dumped tool data for overview_page.pb to /mnt/lustre3p/groups/ASTR1169/GalaxyZoo/results/tensorboard/train/plugins/profile/2021_09_07_12_40_11/gpu4001.overview_page.pb
Dumped tool data for input_pipeline.pb to /mnt/lustre3p/groups/ASTR1169/GalaxyZoo/results/tensorboard/train/plugins/profile/2021_09_07_12_40_11/gpu4001.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to /mnt/lustre3p/groups/ASTR1169/GalaxyZoo/results/tensorboard/train/plugins/profile/2021_09_07_12_40_11/gpu4001.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to /mnt/lustre3p/groups/ASTR1169/GalaxyZoo/results/tensorboard/train/plugins/profile/2021_09_07_12_40_11/gpu4001.kernel_stats.pb
INFO:All epochs completed - finishing gracefully
INFO:Loading and returning (best) model