"""Store 3D grid data and binary labels for ATP-site detection in a PyTables file.

Usage: python store_pytable.py <train_or_test> <pos_or_neg>

Reads per-structure .dat arrays from ../site_atp_numpy/ and writes them, along
with a label vector (0 for "neg", 1 for "pos"), into extendable 'data' and
'label' EArrays in ../pytables_atp/<train_or_test>_<pos_or_neg>.pytables.

(Recovered from a compiled module; constants marked "assumed" in comments
could not be read exactly from the compiled file.)
"""
from __future__ import division
from __future__ import print_function

import os
import sys
import random

import numpy
import tables

DEFAULT_NODE_NAME = "defaultNode"


def init_h5_file(toDiskName, groupName=DEFAULT_NODE_NAME,
                 groupDescription=DEFAULT_NODE_NAME):
    """
    toDiskName: the name of the file on disk
    """
    h5file = tables.open_file(toDiskName, mode="w", title="Dataset")
    h5file.create_group(h5file.root, groupName, groupDescription)
    return h5file


class InfoToInitArrayOnH5File(object):
    def __init__(self, name, shape, atomicType):
        """
        name: the name of this matrix
        shape: tuple indicating the shape of the matrix (similar to numpy shapes)
        atomicType: one of the pytables atomic types - eg: tables.Float32Atom()
            or tables.StringAtom(itemsize=length)
        """
        self.name = name
        self.shape = shape
        self.atomicType = atomicType


def writeToDisk(theH5Column, whatToWrite, batch_size=5000):
    """
    Going to write to disk in batches of batch_size
    """
    data_size = len(whatToWrite)
    last = int(data_size / float(batch_size)) * batch_size
    for i in range(0, data_size, batch_size):
        stop = (i + data_size % batch_size if i >= last
                else i + batch_size)
        theH5Column.append(whatToWrite[i:stop])
        theH5Column.flush()


def getH5column(h5file, columnName, nodeName=DEFAULT_NODE_NAME):
    node = h5file.get_node('/', nodeName)
    return getattr(node, columnName)


def initColumnsOnH5File(h5file, infoToInitArraysOnH5File, expectedRows,
                        nodeName=DEFAULT_NODE_NAME, complib='blosc',
                        complevel=9):
    """
    h5file: filehandle to the h5file, initialised with init_h5_file
    infoToInitArraysOnH5File: array of instances of InfoToInitArrayOnH5File
    expectedRows: this code is set up to work with EArrays, which can be
        extended after creation. (Presumably, if your data is too big to fit
        in memory, you're going to have to use EArrays to write it in
        pieces.) "expectedRows" is the estimated size of the final array; it
        is used by the compression algorithm and can have a significant
        impact on performance.
    nodeName: the name of the node being written to
    complib: the docs seem to recommend blosc for compression...
    complevel: compression level. Not really sure how much of a difference
        this number makes...
    """
    gcolumns = h5file.get_node(h5file.root, nodeName)
    filters = tables.Filters(complib=complib, complevel=complevel)
    for infoToInitArrayOnH5File in infoToInitArraysOnH5File:
        finalShape = [0]  # in an EArray, the extendable dimension has length 0
        finalShape.extend(infoToInitArrayOnH5File.shape)
        h5file.create_earray(gcolumns, infoToInitArrayOnH5File.name,
                             atom=infoToInitArrayOnH5File.atomicType,
                             shape=finalShape,
                             title=infoToInitArrayOnH5File.name,
                             filters=filters, expectedrows=expectedRows)


def performScikitFit(predictors, outcomes):
    import sklearn.linear_model
    model = sklearn.linear_model.LinearRegression()
    model.fit(predictors, outcomes)
    print(model.predict([[2.0, 2.0]]))  # predict() expects a 2-D array of samples


if __name__ == '__main__':
    train_or_test = sys.argv[1]  # e.g. "train" or "test"
    pos_or_neg = sys.argv[2]     # "pos" or "neg"
    input_dir = '../site_atp_numpy/' + train_or_test
    ID = train_or_test + '_' + pos_or_neg

    files = [f for f in os.listdir(input_dir)
             if os.path.isfile(os.path.join(input_dir, f))]
    files = [t for t in files if ID in t]
    files = [t for t in files if '.dat' in t]
    total_num = len(files)
    print(total_num)

    filename_train = '../pytables_atp/' + ID + '.pytables'
    h5file = init_h5_file(filename_train)

    dataName = 'data'
    dataShape = [20, 20, 20, 4]  # assumed: 20x20x20 grid with 4 atom channels
    labelName = 'label'
    labelShape = []  # labels are scalars, so only the extendable dimension
    dataInfo = InfoToInitArrayOnH5File(dataName, dataShape,
                                       tables.Float32Atom())
    labelInfo = InfoToInitArrayOnH5File(labelName, labelShape,
                                        tables.Float32Atom())

    num_of_dat = min(1000, total_num)
    numSamples = num_of_dat * 1000  # rough expectedrows estimate (assumed constant)
    initColumnsOnH5File(h5file, [dataInfo, labelInfo], numSamples)

    dataColumn = getH5column(h5file, dataName)
    labelColumn = getH5column(h5file, labelName)

    for dat_num in range(1, num_of_dat):  # files are assumed to be numbered from 1
        print(dat_num)
        X = numpy.load(input_dir + '/' + ID + '_' + str(dat_num) + '.dat')
        writeToDisk(dataColumn, X)

    # one block of X.shape[0] samples was written per loaded file
    actual_size = (num_of_dat - 1) * X.shape[0]
    if pos_or_neg == 'neg':
        y = numpy.zeros((actual_size,), dtype=numpy.float32)
    elif pos_or_neg == 'pos':
        y = numpy.ones((actual_size,), dtype=numpy.float32)
    writeToDisk(labelColumn, y)
    h5file.close()