
Problems classifying with the pretrained inception_resnet_v2 network in TensorFlow

1. Problem description

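In this exercise I am using the code from the slim framework and want to fine-tune the pretrained inception_resnet_v2 network on my own dataset for classification. However, the following error appears: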

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [764] rhs shape= [1001]
     [[Node: save/Assign_8 = Assign[T=DT_FLOAT, _class=["loc:@InceptionResnetV2/AuxLogits/Logits/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](InceptionResnetV2/AuxLogits/Logits/biases, save/RestoreV2_8)]]

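As far as I can tell, the restore is trying to load a checkpoint tensor with 1001 elements (rhs) into a graph variable that holds only 764 elements (lhs), which triggers the error.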
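To see which variables are affected, the shapes in the graph can be compared with the shapes stored in the checkpoint. The following is a minimal sketch that runs without TensorFlow: both dictionaries are hand-written stand-ins (assumptions, not real output), and `find_shape_mismatches` is a hypothetical helper. Only the AuxLogits biases name and the 764/1001 sizes come from the error above; 764 is presumably the number of classes in the custom dataset, while the ImageNet checkpoint has 1001.

```python
# Sketch: find variables whose shape in the graph differs from the checkpoint.
# Both dicts are hand-written stand-ins (assumptions), not real output.
# With TF 1.x, the checkpoint side could be read with
# tf.train.NewCheckpointReader(path).get_variable_to_shape_map().

def find_shape_mismatches(graph_shapes, checkpoint_shapes):
    """Return {name: (graph_shape, checkpoint_shape)} for every conflict."""
    mismatches = {}
    for name, graph_shape in graph_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        # Variables missing from the checkpoint are a separate problem; skip them.
        if ckpt_shape is not None and tuple(ckpt_shape) != tuple(graph_shape):
            mismatches[name] = (tuple(graph_shape), tuple(ckpt_shape))
    return mismatches


# Graph built for a 764-class dataset (illustrative subset of variables).
graph_shapes = {
    "InceptionResnetV2/Conv2d_1a_3x3/weights": (3, 3, 3, 32),
    "InceptionResnetV2/AuxLogits/Logits/biases": (764,),
    "InceptionResnetV2/Logits/Logits/biases": (764,),
}
# Pretrained ImageNet checkpoint with 1001 classes (illustrative subset).
checkpoint_shapes = {
    "InceptionResnetV2/Conv2d_1a_3x3/weights": (3, 3, 3, 32),
    "InceptionResnetV2/AuxLogits/Logits/biases": (1001,),
    "InceptionResnetV2/Logits/Logits/biases": (1001,),
}

mismatches = find_shape_mismatches(graph_shapes, checkpoint_shapes)
for name, (lhs, rhs) in sorted(mismatches.items()):
    print(f"{name}: graph {lhs} vs checkpoint {rhs}")
```

In this toy data only the two classification-layer biases conflict, which matches the pattern in the error: the layers whose size depends on the number of classes cannot be restored from the ImageNet checkpoint as-is.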

2

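Searching around, I found that others have hit a similar problem and fixed it by deleting the checkpoint data left over from earlier training. But I am running on tinymind (essentially a cloud computing service), so there should be no data left from a previous run.
I also tried modifying the slim framework code, without success (possibly I changed the wrong thing).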
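One direction that may be worth checking instead of editing the framework: slim's train_image_classifier.py accepts a `--checkpoint_exclude_scopes` flag, which skips restoring variables under the listed scopes so that the class-dependent layers are freshly initialized. The sketch below imitates that filtering on plain name strings so it runs without TensorFlow. `select_variables_to_restore` is a hypothetical helper and the variable list is an illustrative assumption; whether this resolves the case above is not confirmed in this thread.

```python
# Sketch of scope-based filtering, mirroring the idea behind slim's
# --checkpoint_exclude_scopes flag. It works on name strings only;
# the variable names below are illustrative assumptions.

def select_variables_to_restore(variable_names, exclude_scopes):
    """Keep only the names that do not start with any excluded scope."""
    exclusions = [s.strip() for s in exclude_scopes.split(",") if s.strip()]
    return [
        name
        for name in variable_names
        if not any(name.startswith(scope) for scope in exclusions)
    ]


model_variables = [
    "InceptionResnetV2/Conv2d_1a_3x3/weights",
    "InceptionResnetV2/AuxLogits/Conv2d_1b_1x1/weights",
    "InceptionResnetV2/AuxLogits/Logits/biases",
    "InceptionResnetV2/Logits/Logits/weights",
    "InceptionResnetV2/Logits/Logits/biases",
]

# Exclude both classification heads; everything else would be restored.
to_restore = select_variables_to_restore(
    model_variables,
    "InceptionResnetV2/Logits, InceptionResnetV2/AuxLogits",
)
print(to_restore)
```

With the real script, the equivalent would be passing `--checkpoint_exclude_scopes=InceptionResnetV2/Logits,InceptionResnetV2/AuxLogits` on the command line alongside `--checkpoint_path`.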

Relevant code


Caused by op 'save/Assign_8', defined at:
  File "./train_image_classifier.py", line 581, in <module>
    tf.app.run()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "./train_image_classifier.py", line 571, in main
    init_fn=_get_init_fn(),
  File "./train_image_classifier.py", line 369, in _get_init_fn
    ignore_missing_vars=FLAGS.ignore_missing_vars)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 688, in assign_from_checkpoint_fn
    saver = tf_saver.Saver(var_list, reshape=reshape_variables)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1239, in __init__
    self.build()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1248, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal
    restore_sequentially, reshape)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 440, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 160, in restore
    self.op.get_shape().is_fully_defined())
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
    validate_shape=validate_shape)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 59, in assign
    use_locking=use_locking, name=name)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

3

Has anyone run into this situation (where it is not caused by old checkpoint data)? Thanks in advance.
slim framework:
https://github.com/tensorflow…

Answer:

Still haven't solved it. I feel like giving up on this model.

Answer:

After I deleted the checkpoint in train_dir, training worked. I am using inceptionv3.
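Whether train_dir still holds checkpoints from an earlier run can be checked before starting training. The helper below is a small sketch under the assumption that checkpoints follow the usual TF 1.x layout (a `checkpoint` index file plus `model.ckpt-*` files). `list_leftover_checkpoints` is a hypothetical name, and the demo uses empty fake files in a temporary directory.

```python
import os
import tempfile


def list_leftover_checkpoints(train_dir):
    """Return sorted file names in train_dir that look like saved checkpoints."""
    if not os.path.isdir(train_dir):
        return []
    return sorted(
        name
        for name in os.listdir(train_dir)
        # Assumed naming: "checkpoint" index file and "model.ckpt-*" data files.
        if name == "checkpoint" or name.startswith("model.ckpt")
    )


# Demo on a throwaway directory containing fake (empty) files.
with tempfile.TemporaryDirectory() as demo_dir:
    for fake in ("checkpoint", "model.ckpt-1000.index", "events.out.tfevents.0"):
        open(os.path.join(demo_dir, fake), "w").close()
    leftovers = list_leftover_checkpoints(demo_dir)
    print(leftovers)
```

An empty result suggests the mismatch does not come from a stale run in train_dir, which is the situation described in the question.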
