Running python generate_tfrecord.py fails with 'utf-8' codec can't decode

Published: 2024/3/13 · python · 豆豆

To convert the CSV file to a TFRecord file, run this command in a terminal:

    python generate_tfrecord.py --csv_input=data/cup_train.csv --output_path=data/cup_train.record

Here the script's csv_input flag receives data/cup_train.csv, and output_path receives data/cup_train.record, that is, the path and name of the output file.
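To illustrate how the two flags map onto values, here is a minimal sketch that uses the standard-library argparse as a stand-in for tf.app.flags. The script itself uses tf.app.flags; the function name parse_flags is an assumption made only for this example.

```python
import argparse


def parse_flags(argv):
    """Parse --csv_input and --output_path, mirroring the script's two flags.

    Hypothetical helper: argparse stands in for tf.app.flags here.
    """
    parser = argparse.ArgumentParser(description="Stand-in for tf.app.flags")
    parser.add_argument("--csv_input", default="", help="Path to the CSV input")
    parser.add_argument("--output_path", default="", help="Path to output TFRecord")
    return parser.parse_args(argv)


flags = parse_flags([
    "--csv_input=data/cup_train.csv",
    "--output_path=data/cup_train.record",
])
print(flags.csv_input)    # data/cup_train.csv
print(flags.output_path)  # data/cup_train.record
```

Note that both flags default to an empty string, so a run without arguments silently produces empty paths rather than an immediate error.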

The script being run is:

    # -*- coding: utf-8 -*-
    """
    Created on Tue Jan 16 01:04:55 2018

    @author: Xiang Guo
    """
    """
    Usage:
      # From tensorflow/models/
      # Create train data:
      python generate_tfrecord.py --csv_input=data/tv_vehicle_labels.csv --output_path=train.record
      # Create test data:
      python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=test.record
    """

    import os
    import io
    import pandas as pd
    import tensorflow as tf

    from PIL import Image
    from object_detection.utils import dataset_util
    from collections import namedtuple, OrderedDict

    os.chdir('D:\\tensorflow-model\\models\\research\\object_detection\\')

    flags = tf.app.flags
    flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
    flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
    FLAGS = flags.FLAGS


    # TO-DO replace this with label map
    def class_text_to_int(row_label):
        if row_label == 'tv':
            return 1
        elif row_label == 'vehicle':
            return 2
        else:
            None


    def split(df, group):
        data = namedtuple('data', ['filename', 'object'])
        gb = df.groupby(group)
        return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


    def create_tf_example(group, path):
        with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
            encoded_jpg = fid.read()
        encoded_jpg_io = io.BytesIO(encoded_jpg)
        image = Image.open(encoded_jpg_io)
        width, height = image.size

        filename = group.filename.encode('utf8')
        image_format = b'jpg'
        xmins = []
        xmaxs = []
        ymins = []
        ymaxs = []
        classes_text = []
        classes = []

        for index, row in group.object.iterrows():
            xmins.append(row['xmin'] / width)
            xmaxs.append(row['xmax'] / width)
            ymins.append(row['ymin'] / height)
            ymaxs.append(row['ymax'] / height)
            classes_text.append(row['class'].encode('utf8'))
            classes.append(class_text_to_int(row['class']))

        tf_example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
        }))
        return tf_example


    def main(_):
        writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
        path = os.path.join(os.getcwd(), 'images')
        examples = pd.read_csv(FLAGS.csv_input)
        grouped = split(examples, 'filename')
        for group in grouped:
            tf_example = create_tf_example(group, path)
            writer.write(tf_example.SerializeToString())

        writer.close()
        output_path = os.path.join(os.getcwd(), FLAGS.output_path)
        print('Successfully created the TFRecords: {}'.format(output_path))


    if __name__ == '__main__':
        tf.app.run()

I ran into the following problems:

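1. Error: AttributeError: module 'tensorflow' has no attribute 'app'

Cause: a TensorFlow version mismatch. The script is written against the TensorFlow 1.x API (tf.app, tf.gfile, tf.python_io), which is no longer exposed at the top level in TensorFlow 2.x.

Fix: change the import

    import tensorflow as tf

to:

    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()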
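As a rough illustration of the version check behind this fix, the sketch below decides from a version string whether the compat.v1 import is needed. The helper needs_compat_v1 is hypothetical, and it assumes a version string of the form "major.minor.patch", such as the value of tf.__version__.

```python
def needs_compat_v1(tf_version):
    """Return True when TF 1.x names such as tf.app need the compat.v1 import.

    Hypothetical helper; assumes a "major.minor.patch" version string.
    """
    major = int(tf_version.split(".")[0])
    return major >= 2


print(needs_compat_v1("1.15.0"))  # False
print(needs_compat_v1("2.10.0"))  # True
```

In the script itself the same decision could be made once at import time by passing tf.__version__, although simply using the compat.v1 import, as above, is usually enough.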

2. Error:

      File "C:\Program Files\python\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 84, in _preread_check
        self._read_buf = _pywrap_file_io.BufferedInputStream(
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 112: invalid continuation byte

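Cause: tracing with print statements showed that the path was wrong. The traceback ends in file_io.py, which is where tf.gfile.GFile reads the image file in create_tf_example.

Fix: in def main(_):, change 'images' in path = os.path.join(os.getcwd(), 'images') to the name of the folder that holds cup_train.csv together with the jpg and xml files, so that the line becomes path = os.path.join(os.getcwd(), 'data').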
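A quick way to catch this kind of path problem before TensorFlow is involved is to check that every image named in the CSV exists in the image folder. The sketch below is one way to do that with the standard library only. The helper find_missing_images is hypothetical, and it assumes the CSV has a filename column (as the script expects) and is UTF-8 encoded unless another encoding is passed.

```python
import csv
import os


def find_missing_images(csv_path, image_dir, encoding="utf-8"):
    """Return file names from the CSV's 'filename' column that are not in image_dir.

    Hypothetical helper; the default encoding is an assumption.
    """
    missing = []
    seen = set()
    with open(csv_path, newline="", encoding=encoding) as f:
        for row in csv.DictReader(f):
            name = row["filename"]
            if name in seen:
                continue  # one image can have several bounding-box rows
            seen.add(name)
            if not os.path.isfile(os.path.join(image_dir, name)):
                missing.append(name)
    return missing
```

Calling find_missing_images('data/cup_train.csv', os.path.join(os.getcwd(), 'data')) before building the TFRecord should return an empty list when the folder name is right.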

3. Error: tf.python_io.TFRecordWriter UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 112: invalid continuation byte

Cause: generate_tfrecord.py was run directly from PyCharm or another Python IDE, so no command-line arguments were passed and FLAGS.output_path in writer = tf.python_io.TFRecordWriter(FLAGS.output_path) contained no file name.

Fix: either run the script from a terminal with both flags, or give the output path and file name directly, for example writer = tf.python_io.TFRecordWriter('D:\\models-master\\research\\object_detection\\data\\cup_train.record').
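Instead of hard-coding the path unconditionally, the script could fall back to a default only when the flag is empty, and fail with a clear message otherwise. The following is a minimal sketch of that idea. The helper resolve_output_path and its default_path parameter are assumptions for illustration, not part of the original script.

```python
import os


def resolve_output_path(flag_value, default_path=""):
    """Return an absolute output path, preferring the flag over the default.

    Hypothetical helper: raises ValueError when neither value is set, so an
    empty --output_path is reported before any writer is created.
    """
    path = flag_value or default_path
    if not path:
        raise ValueError("No output path given: pass --output_path or set a default.")
    return os.path.abspath(path)
```

In main, the result of resolve_output_path(FLAGS.output_path, 'data/cup_train.record') could then be passed to tf.python_io.TFRecordWriter.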

The corrected code:

    #-*- coding : utf-8 -*-
    """
    Created on Tue Jan 16 01:04:55 2018

    @author: Xiang Guo
    """
    """
    Usage:
      # From tensorflow/models/
      # Create train data:
      python generate_tfrecord.py --csv_input=data/tv_vehicle_labels.csv --output_path=train.record
      # Create test data:
      python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=test.record
    """

    import os
    import io
    import pandas as pd
    #import tensorflow as tf
    import tensorflow.compat.v1 as tf

    from PIL import Image
    from object_detection.utils import dataset_util
    from collections import namedtuple, OrderedDict

    os.chdir('D:\\models-master\\research\\object_detection\\')

    flags = tf.app.flags
    flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
    flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
    FLAGS = flags.FLAGS


    # TO-DO replace this with label map
    def class_text_to_int(row_label):
        if row_label == 'cup':
            return 1
        else:
            None


    def split(df, group):
        data = namedtuple('data', ['filename', 'object'])
        gb = df.groupby(group)
        return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


    def create_tf_example(group, path):
        with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
            encoded_jpg = fid.read()
        encoded_jpg_io = io.BytesIO(encoded_jpg)
        image = Image.open(encoded_jpg_io)
        width, height = image.size

        filename = group.filename.encode('utf8')
        image_format = b'jpg'
        xmins = []
        xmaxs = []
        ymins = []
        ymaxs = []
        classes_text = []
        classes = []

        for index, row in group.object.iterrows():
            xmins.append(row['xmin'] / width)
            xmaxs.append(row['xmax'] / width)
            ymins.append(row['ymin'] / height)
            ymaxs.append(row['ymax'] / height)
            classes_text.append(row['class'].encode('utf8'))
            classes.append(class_text_to_int(row['class']))

        tf_example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
        }))
        return tf_example


    def main(_):
        writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
        path = os.path.join(os.getcwd(), 'data')
        examples = pd.read_csv(FLAGS.csv_input, encoding="unicode_escape")
        grouped = split(examples, 'filename')
        for group in grouped:
            tf_example = create_tf_example(group, path)
            writer.write(tf_example.SerializeToString())

        writer.close()
        output_path = os.path.join(os.getcwd(), FLAGS.output_path)
        print('Successfully created the TFRecords: {}'.format(output_path))


    if __name__ == '__main__':
        tf.app.run()

Here, data in path = os.path.join(os.getcwd(), 'data') is the folder named in the paths passed on the command line (data/cup_train.csv and data/cup_train.record).
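Since the images sit in the same folder as the CSV in this setup, the folder name does not have to be typed twice. As a sketch, and assuming that layout holds, the image directory can be derived from the csv_input value. The helper image_dir_from_csv is hypothetical.

```python
import os


def image_dir_from_csv(csv_input):
    """Return the absolute directory that contains the CSV file.

    Hypothetical helper; assumes the images are stored next to the CSV.
    """
    return os.path.dirname(os.path.abspath(csv_input))
```

With csv_input set to data/cup_train.csv and the working directory set by os.chdir, this yields the same folder as os.path.join(os.getcwd(), 'data').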
