arrays - Python - Algorithm efficiency and stability


I have written two algorithms and want to check which of them is more 'efficient', meaning which uses less memory. The first one creates a NumPy array and then modifies that array. The second one creates an empty Python list and appends values to it. Which one is better? First program:

    f = open('/users/marcortiz/documents/vlex/pylearn2/mlearning/classify/files/models/model_training.txt')
    lines = f.readlines()
    f.close()
    zeros = np.zeros((60343,4917))

    for l in lines:
        row = l.split(",")
        for element in row:
            zeros[lines.index(l), row.index(element)] = element

    x = zeros[1,:]
    y = zeros[:,0]
    one_hot = np.ones((counter, 2))

The second one:

    f = open('/users/marcortiz/documents/vlex/pylearn2/mlearning/classify/files/models/model_training.txt')
    lines = f.readlines()
    f.close()
    x = []
    y = []

    for l in lines:
        row = l.split(",")
        x.append([float(elem) for elem in row[1:]])
        y.append(float(row[0]))

    x = np.array(x)
    y = np.array(y)
    one_hot = np.ones((counter, 2))

My theory is that the first one is slower but uses less memory, and that it is more 'stable' when working with large files. The second one is faster but uses a lot of memory, and it is not as stable when working with large files (543 MB, 70,000 lines).

Thanks!

The problem with both versions is that you're loading the whole file into memory first by calling file.readlines(). You should iterate over the file object directly, which reads one line at a time.

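    # Python 2 code: izip and a list-returning map (on Python 3, use zip and list(map(...)))
    from itertools import izip
    import numpy as np

    # generator function: yields (features, label) for one line at a time
    def func():
        with open('filename.txt') as f:
            for line in f:
                row = map(float, line.split(","))
                yield row[1:], row[0]

    x, y = izip(*func())
    x = np.array(x)
    y = np.array(y)
    ...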
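For the preallocated-array variant from the question, the same line-by-line idea could look roughly like the sketch below. It assumes Python 3, that the number of rows and columns is known in advance, and that the first column holds the label as in the second program. The function name `load_preallocated` and the sample data are hypothetical, used only for illustration.

```python
import io
import numpy as np

def load_preallocated(handle, n_rows, n_cols):
    """Fill a preallocated float array from an iterable of comma-separated lines."""
    data = np.zeros((n_rows, n_cols))
    for i, line in enumerate(handle):
        # enumerate supplies the row index, so no lines.index(l) lookup is needed
        data[i, :] = [float(v) for v in line.split(",")]
    return data

# Tiny in-memory stand-in for the real file (hypothetical sample data)
sample = io.StringIO("1,0.5,0.25\n0,1.5,2.5\n")
data = load_preallocated(sample, n_rows=2, n_cols=3)
y = data[:, 0]    # first column: labels
x = data[:, 1:]   # remaining columns: features
```

With a real file, the `io.StringIO` object would be replaced by the handle from `with open(...) as f:`. Note that a 60343 x 4917 array of float64 values takes 60343 * 4917 * 8 bytes, roughly 2.4 GB, regardless of how it is filled.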

I am sure a pure NumPy solution is going to be faster than this.
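One candidate for such a pure NumPy approach is `np.loadtxt` with `delimiter=","`. The sketch below assumes Python 3 and a purely numeric, comma-separated file with the label in the first column; the sample data is hypothetical. Whether it is actually faster or lighter on memory for a 543 MB file has not been measured here, so it is worth timing on the real data.

```python
import io
import numpy as np

# In-memory stand-in for the real file (hypothetical sample data);
# np.loadtxt also accepts a filename or an open file object.
sample = io.StringIO("1,0.5,0.25\n0,1.5,2.5\n")

data = np.loadtxt(sample, delimiter=",")
y = data[:, 0]    # first column: labels
x = data[:, 1:]   # remaining columns: features
```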

