Tuesday, April 25, 2017

Python UnicodeDecodeError



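When you read a CSV file (for example, one exported from a spreadsheet), pass the encoding explicitly to open(), e.g. encoding="utf-8", rather than relying on the platform default. The encoding has to match how the file was actually saved; if the file contains bytes that are not valid in the chosen encoding, you get an error like: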

UnicodeDecodeError: 'utf-8' codec can't decode byte
(result, consumed) = self._buffer_decode(data, self.errors, final)



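import csv
import locale


def read_rows(csv_file_path: str):
    # Without encoding=, open() normally uses the locale's preferred encoding;
    # sys.getdefaultencoding() is always 'utf-8' on Python 3 and does not show this.
    print("default encoding", locale.getpreferredencoding(False))
    rows = []
    # newline='' is what the csv module docs recommend for files given to csv.reader
    with open(csv_file_path, 'rt', encoding="utf-8", newline='') as csvfile:
        csv_reader = csv.reader(csvfile, delimiter=',', quotechar='"')
        for row in csv_reader:  # row is a list of strings
            rows.append(row)
    return rows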
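
If the error still shows up with encoding="utf-8", the file was probably saved in some other encoding (spreadsheet exports on Windows are often cp1252). Below is a minimal sketch of one way to cope, assuming you do not know the encoding in advance: try a few candidates in order and keep the first one that decodes cleanly. The function name read_rows_with_fallback and the candidate list are my own choices for illustration, not anything from the csv module. Note that "latin-1" never raises a decode error, so as the last candidate it always "succeeds", possibly with the wrong characters.

```python
import csv


def read_rows_with_fallback(csv_file_path, encodings=("utf-8-sig", "cp1252", "latin-1")):
    """Try each candidate encoding in order; return (rows, encoding_used).

    Hypothetical helper: the candidate list is an assumption, adjust it
    to the encodings you actually expect to see.
    """
    for encoding in encodings:
        try:
            with open(csv_file_path, "rt", encoding=encoding, newline="") as csvfile:
                # Decoding happens lazily while the reader iterates,
                # so the whole read has to sit inside the try block.
                rows = list(csv.reader(csvfile, delimiter=",", quotechar='"'))
            return rows, encoding
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError("none of the candidate encodings could decode " + csv_file_path)
```

"utf-8-sig" is tried first because it reads plain UTF-8 as well as UTF-8 with a byte order mark, which Excel likes to add. Usage would look like rows, used = read_rows_with_fallback("data.csv"), where "data.csv" is a placeholder path.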