I download a CSV (unicode) file from a website, the fields are separated by tabs and between double quotes and the line break does not make it at the end of each row as seen in the CSV, example of how the information is seen of the CSV when opening it with notepad:
"campo1" "campo2" "campo3" "campo4"
"campo5" "campo6" "000000AA1011" "000000AA1012"
"000000AA1013" "000000AA1014" "000000AA1015" "000000AA1016"
"000000AA1017" "000000AA1018" "000000AA1019" "000000AA1020"
"000000AA1021" "000000AA1022"
I want to convert it to CSV(utf-8) so that the fields are separated by commas and are not between double quotes and the line break is done as seen in the CSV and in this way I can do the LOAD DATA LOCAL INFILE
from MySQL
and insert the CSV records without any problem to the table MySQL
, example of how I want the CSV to look when opened with a notepad:
campo1,campo2,campo3,campo4,campo5,campo6
000000AA1011,000000AA1012,000000AA1013,000000AA1014,000000AA1015,000000AA1016
000000AA1017,000000AA1018,000000AA1019,000000AA1020,000000AA1021,000000AA1022
It is worth mentioning that in the 1st instance try to take the CSV(unicode) and insert it into the table MySQL
as follows:
LOAD DATA LOCAL INFILE 'C:/data.csv' INTO TABLE bd.down_data
FIELDS TERMINATED BY '\t' ESCAPED BY '"'
LINES TERMINATED BY '\n' IGNORE 1 LINES
(campo1, campo2, campo3);
This query inserts the data but puts a space after each character, example:
campo1 campo2 campo3
0 0 0 0 0 0 A A 1 0 1 1 0 0 0 0 0 0 A A 1 0 1 2 0 0 0 0 0 0 A A 1 0 1 3
0 0 0 0 0 0 A A 1 0 1 7 0 0 0 0 0 0 A A 1 0 1 8 0 0 0 0 0 0 A A 1 0 1 9
and try to remove the spaces as follows:
Select REPLACE(campo1,' ','') as campo1 from bd.down_data;
but it does not make any changes, I imagine that the problem is the encoding of the original CSV file and therefore I want to correct the problem from the CSV (have the fields separated by commas and without double quotes when doing the LOAD DATA LOCAL INFILE
) since if I save again the same CSV from the excel as CSV replacing it and I open it with the notepad it already appears separated by commas.
I am doing this in Python
:
import csv
path = 'data.csv'
with open(path, 'r', encoding='utf-8', errors='ignore') as infile, open('final.csv', 'w') as outfile:
inputs = csv.reader(infile)
output = csv.writer(outfile)
for index, row in enumerate(inputs):
if index == 0:
continue
output.writerow(row)
I create the new file final.csv
but without any record and I get the following error:
Traceback (most recent call last):
File "export.py", line 9, in <module>
for index, row in enumerate(inputs):
_csv.Error: line contains NULL byte
UPDATE:
actual file:
https://drive.google.com/open?id=1LnCJVVc83HPAXAc88BMtKoHFIHpP5xtW