
Question

Asked: 2022-07-02 19:17:53 +0800 CST

String Column Optimization


I have a column called "ASSET". It is of type string and its values have 10 or 11 characters.

From this column I generate another one called "CLASE" with the following code:

join["CLASE"] = ""
for row in join["ASSET"]:
    join["CLASE"] = [row[0:3] if len(row)==11 else row[0:4] for row in join["ASSET"]]

In essence the code works and does what it is supposed to do, which is the following: if the record has 11 characters, it returns the first 3 characters, and if it has 12 characters, it returns the first 4 characters.

However, it takes a long time to run. I don't know whether the list comprehension is too much for the amount of data I'm handling (approx. 156,000 records) or whether something else is wrong. Can you think of a way to do the same thing this snippet does, but more efficiently? Currently it takes about an hour to run just that cell.

1 Answer


HeytalePazguato · Answer 1 · 2022-07-02T19:51:35+08:00

Good day,

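The problem is that you have a for loop, then a list comprehension inside it, and then an if. The list comprehension already walks the whole column, so wrapping it in the for loop rebuilds the entire "CLASE" column once for every row.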
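As a rough sketch of that point, the snippet below runs the comprehension once and compares it with the looped version. The three sample values and the helper name `clase_from_asset` are made up for illustration, and the rule is kept exactly as written in the question's code (3 characters when the length is 11, otherwise 4).

```python
import pandas as pd

# Hypothetical stand-in for the asker's `join` DataFrame.
join = pd.DataFrame({"ASSET": ["10012345678", "1012345678", "12123456789"]})


def clase_from_asset(asset: str) -> str:
    # Same rule as the question's code: 3 characters for length 11, otherwise 4.
    return asset[0:3] if len(asset) == 11 else asset[0:4]


# Single pass: the comprehension alone already covers every row.
join["CLASE"] = [clase_from_asset(a) for a in join["ASSET"]]

# Looped version, as in the question: the same comprehension is re-run
# once per row, so the work grows with the square of the row count.
looped = join[["ASSET"]].copy()
passes = 0
for _ in looped["ASSET"]:
    looped["CLASE"] = [clase_from_asset(a) for a in looped["ASSET"]]
    passes += 1

same_result = looped["CLASE"].tolist() == join["CLASE"].tolist()
print(passes, same_result)
```

With about 156,000 rows, the looped version evaluates the comprehension about 156,000 times, which is consistent with the long run time described in the question.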

You can do it directly from the values of your column with the pandas.Series.str accessor.

Since you didn't post your data as text, I created a generic example; you will have to adapt the column names.

The data that I am using in the "sample2.csv" file is this:

asset
10012345678
20012345678
30012345678
40012345678
50012345678
60012345678
70012345678
80012345678
90012345678
1012345678
2012345678
3012345678
4012345678
5012345678
6012345678
7012345678
8012345678
9012345678
0112345678
0212345678
0312345678
0412345678
0512345678
0612345678
0712345678
0812345678
0912345678
1112345678
12123456789

In your example you mention that if the string has 10 digits you want the first 3, and if it has 11 you want the first 4, which is why you use the if. Put another way, what you want is every character from the beginning of the string up to position "-7", that is, everything except the last 7 characters.

Everything you have can be reduced to one line:

import pandas as pd

df = pd.read_csv('sample2.csv', dtype=str)

df['clase'] = df['asset'].str[:-7]

This returns:

    asset       clase
0   10012345678 1001
1   20012345678 2001
2   30012345678 3001
3   40012345678 4001
4   50012345678 5001
5   60012345678 6001
6   70012345678 7001
7   80012345678 8001
8   90012345678 9001
9   1012345678  101
10  2012345678  201
11  3012345678  301
12  4012345678  401
13  5012345678  501
14  6012345678  601
15  7012345678  701
16  8012345678  801
17  9012345678  901
18  0112345678  011
19  0212345678  021
20  0312345678  031
21  0412345678  041
22  0512345678  051
23  0612345678  061
24  0712345678  071
25  0812345678  081
26  0912345678  091
27  1112345678  111
28  12123456789 1212
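If you want to try this without the CSV file, here is a minimal sketch that builds a few of the values above in memory and checks the slice against the explicit length rule described in this answer (10 characters gives the first 3, 11 gives the first 4). The `expected` and `matches` names are only for illustration, and the check assumes every value has exactly 10 or 11 characters.

```python
import pandas as pd

# A few of the sample values, kept as strings so leading zeros survive.
df = pd.DataFrame(
    {"asset": ["10012345678", "1012345678", "0112345678", "12123456789"]}
)

# Vectorized slice: drop the last 7 characters of every value.
df["clase"] = df["asset"].str[:-7]

# Explicit length-based rule, written out only to confirm that the slice
# gives the same result for 10- and 11-character values.
expected = [a[:3] if len(a) == 10 else a[:4] for a in df["asset"]]
matches = df["clase"].tolist() == expected

print(df)
print(matches)
```

If the real column can contain values of other lengths, the slice and the if-based rule would no longer agree, so it is worth checking the lengths first.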
