Question

jsdnlb

Asked: 2020-12-15 06:05:53 +0800 CST

How can I define multiple segments of a numpy array based on start/end index pairs without iterating?

I have two arrays holding the start and end indices (they could equally be start and length) of segments in a large array x. Each pair identifies a sequence of integers that I need to process, and the sequences are of variable length.

x=numpy.array([2,3,5,7,9,12,15,21,27,101, 250]) # can have a length in the millions
starts=numpy.array([2,7]) # can have a length in the thousands
ends=numpy.array([5,9])

# required output is x[2:5],x[7:9] in flat 1D array
# [5,7,9,21,27]

I can easily do this with for loops, but the app is performance sensitive, so I'm looking for a way to do it without a Python-level loop.

Any help will be gratefully received!

1 Answer

jsdnlb · Answer 1 · 2020-02-08T11:23:00+08:00

Proposal #1

A vectorized approach is to build a mask with broadcasting (np is NumPy, imported with import numpy as np):

In [16]: r = np.arange(len(x))

In [18]: x[((r>=starts[:,None]) & (r<ends[:,None])).any(0)]
Out[18]: array([ 5,  7,  9, 21, 27])
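
To make the intermediate step visible, here is a self-contained sketch of the same broadcasting mask using the arrays from the question. The name in_segment is introduced here for illustration only. Note that the comparison builds a boolean matrix of shape (len(starts), len(x)), so memory use grows with the product of the two lengths.

```python
import numpy as np

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])
starts = np.array([2, 7])
ends = np.array([5, 9])

r = np.arange(len(x))

# in_segment[k, i] is True when index i lies in [starts[k], ends[k]).
# starts[:, None] has shape (2, 1) and r has shape (11,), so the
# comparison broadcasts to shape (2, 11).
in_segment = (r >= starts[:, None]) & (r < ends[:, None])

# An index is kept if it falls inside any of the segments.
mask = in_segment.any(axis=0)
out = x[mask]

print(in_segment.shape)  # (2, 11)
print(out)               # [ 5  7  9 21 27]
```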

Proposal #2

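Another vectorized way is to mark each start with 1 and each end with -1 and take the cumulative sum, which produces ramps of 1s and 0s. This would be better with lots of start and end pairs. It assumes that no end index coincides with another segment's start, since the second assignment would overwrite the first: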

idx = np.zeros(len(x),dtype=int)
idx[starts] = 1
idx[ends[ends<len(x)]] = -1
out = x[idx.cumsum().astype(bool)]
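
If segments may touch (one segment ends exactly where the next starts), plain assignment lets the -1 overwrite the 1 at the shared index. A possible variant, sketched below, accumulates the markers with np.add.at instead. The function name select_segments_cumsum is hypothetical and not part of the original answer.

```python
import numpy as np

def select_segments_cumsum(x, starts, ends):
    """Hypothetical helper: cumsum-based selection that accumulates markers."""
    idx = np.zeros(len(x), dtype=int)
    # +1 at every index where a segment opens.
    np.add.at(idx, starts, 1)
    # -1 at every index where a segment closes; an end equal to len(x)
    # needs no marker because the array stops there anyway.
    np.add.at(idx, ends[ends < len(x)], -1)
    # The running sum is positive exactly inside at least one segment.
    return x[idx.cumsum() > 0]

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])

# Segments from the question.
print(select_segments_cumsum(x, np.array([2, 7]), np.array([5, 9])))
# [ 5  7  9 21 27]

# Touching segments x[2:5] and x[5:7]: the markers at index 5 cancel out.
print(select_segments_cumsum(x, np.array([2, 5]), np.array([5, 7])))
# [ 5  7  9 12 15]
```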

Proposal #3

Another option is loop-based. It is more memory-efficient and might be better with lots of entries in the start-end pairs:

mask = np.zeros(len(x),dtype=bool)
for (i,j) in zip(starts,ends):
    mask[i:j] = True
out = x[mask]
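
One property of any mask-based selection is worth keeping in mind: each element is returned at most once, in array order. For the non-overlapping segments in the question this matches the concatenated slices, but for overlapping segments the two differ. The overlapping starts and ends below are made up for illustration.

```python
import numpy as np

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])
starts = np.array([2, 3])  # overlapping segments, chosen for illustration
ends = np.array([5, 6])

# Mask-based selection: the union of the segments, each element once.
mask = np.zeros(len(x), dtype=bool)
for i, j in zip(starts, ends):
    mask[i:j] = True
via_mask = x[mask]

# Concatenating the slices keeps the repeated elements.
via_concat = np.concatenate([x[i:j] for i, j in zip(starts, ends)])

print(via_mask)    # [ 5  7  9 12]
print(via_concat)  # [ 5  7  9  7  9 12]
```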

Proposal #4

For completeness, here is another loop-based one that copies each slice into a preallocated output array. It should do well when the slices are selected from a large array:

lens = ends-starts
out = np.empty(lens.sum(),dtype=x.dtype)
start = 0
for (i,j,l) in zip(starts,ends,lens):
    out[start:start+l] = x[i:j]
    start += l

If there are many iterations, a minor optimization reduces the computation per iteration by precomputing the output offsets:

lens = ends-starts
lims = np.r_[0,lens].cumsum()
out = np.empty(lims[-1],dtype=x.dtype)
for (i,j,s,t) in zip(starts,ends,lims[:-1],lims[1:]):
    out[s:t] = x[i:j]
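
The same slice-copy result can probably also be obtained without a Python loop by building the flat index array directly with np.repeat and np.arange. This is a different technique from the ones in the answer, sketched here under the assumption that every end is greater than or equal to its start. The name segment_indices is hypothetical.

```python
import numpy as np

def segment_indices(starts, ends):
    """Hypothetical helper: flat indices of x[s:e] for every (s, e) pair."""
    lens = ends - starts
    # Offset of each segment inside the flat output.
    offsets = np.cumsum(lens) - lens
    # Position inside its own segment for every output element.
    within = np.arange(lens.sum()) - np.repeat(offsets, lens)
    # Segment start plus position inside the segment.
    return np.repeat(starts, lens) + within

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])
starts = np.array([2, 7])
ends = np.array([5, 9])

idx = segment_indices(starts, ends)
out = x[idx]

print(idx)  # [2 3 4 7 8]
print(out)  # [ 5  7  9 21 27]
```

Unlike the mask-based proposals, this keeps the segments in the order given and repeats elements when segments overlap, so it matches the loop in Proposal #4.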
