Question

Rodrigo Cortés

Asked: 2022-04-16 17:59:58 +0800 CST

Why does pandas modify the data type of the columns of my DataFrame?

Hi, I'm new to Python and I can't fix this error. Let me explain:

I have this code:

import pandas as pd

# Column names for the headerless CSV file
encabezado = ['RFC', 'EMP', 'COMPROBANTE', 'TIPO', 'CPT', 'IMPORTE', 'ANIO',
              'QNA', 'PTDA', 'C1', 'C2', 'PRDNAME', 'COL', 'C3']

file = pd.read_csv('TRA.csv', low_memory=False, sep=",", names=encabezado)
df = pd.DataFrame(file)
# Concatenate TIPO and CPT as strings into a new column
conceptos = df['TIPO'].map(str) + df['CPT'].map(str)
df.loc[:, 'COLUMNAS'] = conceptos
print(df)
df.to_csv('TRA1321_2.csv', sep=',')

My encabezado (header) variable holds the column names of my DataFrame. I then concatenate two of those columns and write the result to a new file. The problem is that the output does not preserve the leading zeros that some values have, and it even converts integer values to float, as shown below:

My original csv file:

SARS751009J27,2000003369,701457548,1,37,   6299.99,2021,13,TP,,,PRDE130,000001,
SARS751009J27,2000003369,701457548,2,01,    1430.7,2021,13,TP,,,PRDE130,000001,
OEGC8105169P5,2000503934,701457549,1,30,     558.4,2021,13,BR,,,PRDE130,000002,
OEGC8105169P5,2000503934,701457549,2,01,    119.26,2021,13,00,,,PRDE130,000002,

The file that this script generates:

,RFC,EMP,COMPROBANTE,TIPO,CPT,IMPORTE,ANIO,QNA,PTDA,C1,C2,PRDNAME,COL,C3,COLUMNAS
0,SARS751009J27,2000003369.0,701457548.0,1.0,37,6299.99,2021.0,13.0,TP,,,PRDE130,1.0,,1.037
1,SARS751009J27,2000003369.0,701457548.0,2.0,01,1430.7,2021.0,13.0,TP,,,PRDE130,1.0,,2.001
2,OEGC8105169P5,2000503934.0,701457549.0,1.0,30,558.4,2021.0,13.0,BR,,,PRDE130,2.0,,1.030
3,OEGC8105169P5,2000503934.0,701457549.0,2.0,01,119.26,2021.0,13.0,00,,,PRDE130,2.0,,2.001

It also adds a numbered index column at the start of every row, which shifts everything over.

Can you help me solve this? I can't figure out how to do it. Thanks.

1 Answer

Rodrigo Cortés · Answer 1 · 2022-04-16T22:22:29+08:00

I solved my problem. The issue was that I was not handling the empty (NaN) values: by default pandas cannot store NaN in an integer column, so any numeric column with an empty field was converted to float.
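The type change can be reproduced with a minimal sketch. The two-row sample and the reduced column list below are assumptions made for illustration, not the real TRA.csv; the second row has an empty TIPO field.

```python
import io
import pandas as pd

# Hypothetical two-row sample: the second row has an empty TIPO field.
sample = io.StringIO("A,1,37\nB,,01\n")
df = pd.read_csv(sample, names=["RFC", "TIPO", "CPT"])

# TIPO contains a missing value, so pandas stores the column as float64.
print(df.dtypes)

# Concatenating as strings then carries the ".0" into the result.
combined = df["TIPO"].map(str) + df["CPT"].map(str)
print(combined.iloc[0])  # 1.037
```

Note that CPT loses its leading zero here for a different reason: the column is parsed as a number, so 01 becomes 1.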

I realized this because when I removed the empty columns manually, the file loaded correctly.

So I read the pandas documentation and found the following, which I roughly translate here:

keep_default_na: bool, default True. Whether or not to include the default NaN values when parsing the data. Depending on whether na_values is passed in, the behavior is as follows:

If keep_default_na is True and na_values are specified, na_values is appended to the default NaN values used for parsing.

If keep_default_na is True and na_values are not specified, only the default NaN values are used for parsing.

If keep_default_na is False and na_values are specified, only the NaN values specified in na_values are used for parsing.

If keep_default_na is False and na_values are not specified, no strings will be parsed as NaN.
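The four cases can be checked with a small sketch. The sample data, the "missing" marker, and the count_nan helper are hypothetical names introduced only for this illustration.

```python
import io
import pandas as pd

# Hypothetical sample: column b holds an empty field, "NA", and "missing".
data = "a,b\n1,\n2,NA\n3,missing\n"

def count_nan(**kwargs):
    """Count how many values in column b are parsed as NaN."""
    df = pd.read_csv(io.StringIO(data), **kwargs)
    return int(df["b"].isna().sum())

# Same order as the four cases described above.
counts = [
    count_nan(keep_default_na=True, na_values=["missing"]),
    count_nan(keep_default_na=True),
    count_nan(keep_default_na=False, na_values=["missing"]),
    count_nan(keep_default_na=False),
]
print(counts)
```

If the documented behavior holds, each case should treat one fewer string as NaN than the one before it, ending with none in the last case.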

Here is the link in case you would like to read it:

pandas.read_csv

So all I did was modify this line, adding keep_default_na=False so that empty fields are no longer parsed as NaN:

file=pd.read_csv('TRA.csv',low_memory=False,sep=",",names=encabezado,keep_default_na=False)
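As a possible extension of this fix (not part of what I originally did), passing dtype=str to read_csv should keep every column as text, which preserves leading zeros even in columns that have no empty fields, and index=False in to_csv should drop the numbered index column. The sketch below assumes two rows from the question stand in for TRA.csv and writes to an in-memory buffer instead of a file.

```python
import io
import pandas as pd

encabezado = ['RFC', 'EMP', 'COMPROBANTE', 'TIPO', 'CPT', 'IMPORTE', 'ANIO',
              'QNA', 'PTDA', 'C1', 'C2', 'PRDNAME', 'COL', 'C3']

# Assumption: two rows copied from the question stand in for TRA.csv.
raw = (
    "SARS751009J27,2000003369,701457548,1,37,   6299.99,2021,13,TP,,,PRDE130,000001,\n"
    "SARS751009J27,2000003369,701457548,2,01,    1430.7,2021,13,TP,,,PRDE130,000001,\n"
)

# dtype=str reads every column as text; keep_default_na=False keeps
# empty fields as empty strings instead of NaN.
df = pd.read_csv(io.StringIO(raw), names=encabezado, dtype=str,
                 keep_default_na=False)

# Plain string concatenation, no float conversion involved.
df['COLUMNAS'] = df['TIPO'] + df['CPT']

# index=False omits the row-number column from the output.
out = io.StringIO()
df.to_csv(out, index=False)
print(out.getvalue())
```

Because everything stays as text, IMPORTE keeps its leading spaces and would need an explicit conversion (for example with pd.to_numeric) before doing arithmetic on it.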
