pickle - Frequently Updating Stored Data for a Numerical Experiment Using Python
I am running a numerical experiment that requires many iterations. After each iteration I want to store the data in a pickle file (or something pickle-like) in case the program times out or a data structure becomes tapped out. What is the best way to proceed? Here is the skeleton code:
    import pickle

    data_dict = {}                                 # maybe a dictionary is not the best choice
    for j in parameters:                           # j = (alpha, beta, gamma), cycle through these
        for k in range(number_of_experiments):     # lots of experiments (10^4)
            file = open('storage.pkl', 'ab')
            data = experiment()                    # experiment() returns a numerical value;
                                                   # it takes ~1 second, increasing as the parameters scale
            data_dict.setdefault(j, []).append(data)
            pickle.dump(data_dict, file)
            file.close()
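Since the file is opened in append mode, each dump adds one more (and larger) snapshot of data_dict to the file. For reference, a minimal sketch of reading back the most recent snapshot, assuming the layout produced by the loop above:

    import pickle

    def load_latest(path='storage.pkl'):
        """Return the most recent data_dict snapshot appended to the file."""
        latest = None
        with open(path, 'rb') as f:
            while True:
                try:
                    latest = pickle.load(f)    # each load reads one appended dump
                except EOFError:
                    break                      # no more snapshots in the file
        return latest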
Questions:
- Is shelve a better choice here, or is there another Python library I am not aware of? (A sketch of what a shelve variant might look like follows this list.)
- I am using a dict for the data because it is easier to code and more flexible if I need to change things or run more experiments. Is there a huge advantage to using a pre-allocated array?
- Does opening and closing the file affect the run time? It lets me check on progress in addition to the text logs I have set up.
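A minimal sketch of what the shelve variant might look like, assuming the same parameters and number_of_experiments names as in the skeleton above (shelve keys must be strings, so the (alpha, beta, gamma) tuple is stringified):

    import shelve

    with shelve.open('storage.db') as db:
        for j in parameters:                     # j = (alpha, beta, gamma)
            key = str(j)                         # shelve keys must be strings
            results = db.get(key, [])
            for k in range(number_of_experiments):
                results.append(experiment())
                db[key] = results                # reassign so the shelf sees the update
                db.sync()                        # flush to disk in case of a crash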
Thanks for the help!
- Assuming you are using numpy for the numerical experiments, I suggest using numpy.savez instead of pickle (a sketch follows below this list). Keep it simple and only make optimizations if you feel the script runs for too long.
- Opening and closing files does affect the run time, but having a backup is better anyway.
- Also, use collections.defaultdict(list) instead of a plain dict with setdefault (second sketch below).
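A minimal sketch of the numpy.savez suggestion, assuming experiment() returns a scalar and reusing the hypothetical parameters / number_of_experiments names from the question; here the whole .npz archive is rewritten as a checkpoint after each parameter set:

    import numpy as np

    all_results = {}                               # one array of outcomes per parameter set
    for i, j in enumerate(parameters):             # j = (alpha, beta, gamma)
        values = np.empty(number_of_experiments)
        for k in range(number_of_experiments):
            values[k] = experiment()               # assumes a scalar return value
        all_results['set_%d' % i] = values
        # rewrite the archive as a checkpoint after each parameter set
        np.savez('storage.npz', params=np.array(parameters), **all_results)

    # np.load('storage.npz') later returns a dict-like object of the saved arrays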
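And a minimal sketch of the defaultdict suggestion applied to the question's loop, with the file handling left exactly as in the original skeleton:

    import pickle
    from collections import defaultdict

    data_dict = defaultdict(list)                  # missing keys start out as empty lists
    for j in parameters:                           # j = (alpha, beta, gamma)
        for k in range(number_of_experiments):
            data = experiment()
            data_dict[j].append(data)              # replaces setdefault(j, []).append(data)
            with open('storage.pkl', 'ab') as f:   # unchanged from the question
                pickle.dump(data_dict, f)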