PROSITE module - README

data/ --- Datasets for training and evaluating the 10 PROSITE functional families

data_process/ --- Generate training datasets:
 
	cut_box_site_noNMR_std_sulfur.py 
	cut_box_site_fn_curated.py
	cut_box_site_fp_curated.py
	store_pytable.py 
	
models/ --- Model definitions and training:

	Voxel_3DCNN.py
	Voxel_SVM.py
	FEATURE_SVM.py
	FEATURE_1DCNN.py
	layers.py
	utils.py

evaluate/ --- Evaluation after training:

	Step (1): For the PROSITE true positive and true negative datasets, compute the probability score of each test site using the model from the corresponding test fold of 5-fold cross-validation.

	(Only the 3DCNN and 1DCNN models have scripts for this step. The SVM models evaluate and save probability estimates for the test examples directly in Voxel_SVM.py and FEATURE_SVM.py.)

	eval_tp_tn_3DCNN.py
	eval_tp_tn_1DCNN.py
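
The idea behind Step (1) can be sketched as follows: each test site is scored only by the model whose held-out fold contains it. This is a minimal illustration, not the code in eval_tp_tn_3DCNN.py or eval_tp_tn_1DCNN.py. The function name `score_test_folds`, its arguments, and the toy models are all hypothetical.

```python
def score_test_folds(sites, fold_ids, fold_models):
    """Score each site with the model of the fold that held it out.

    sites       -- list of per-site inputs (hypothetical representation)
    fold_ids    -- list giving the test fold of each site
    fold_models -- dict mapping fold id to a callable returning a probability
    """
    scores = [None] * len(sites)
    for i, (site, fold) in enumerate(zip(sites, fold_ids)):
        predict = fold_models.get(fold)
        if predict is not None:  # sites whose fold has no model stay None
            scores[i] = float(predict(site))
    return scores


# Toy usage: two stand-in "models" that return constant probabilities.
toy_models = {0: lambda site: 0.25, 1: lambda site: 0.75}
toy_scores = score_test_folds(["a", "b", "c"], [0, 1, 0], toy_models)
```

Keeping the fold assignment next to each site means no site is ever scored by a model that saw it during training.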
	
	Step (2): Using the generated probability scores, evaluate the precision and recall of each individual site at a user-specified probability threshold. Find the probability threshold that yields a precision of 0.99.

	PR_tp_tn_3DCNN.py
	PR_tp_tn_1DCNN.py
	PR_tp_tn_FEATURE_SVM.py
	PR_tp_tn_Voxel_SVM.py
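
A minimal sketch of the Step (2) calculation, assuming scores and binary labels are available as plain Python lists. The function names are hypothetical and do not come from the PR_tp_tn_*.py scripts. Two assumptions are made here: a site is called positive when its score is greater than or equal to the threshold, and the search only considers observed scores as candidate thresholds.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when sites with score >= threshold are called positive."""
    tp = fp = fn = 0
    for score, is_positive in zip(scores, labels):
        called = score >= threshold
        if called and is_positive:
            tp += 1
        elif called:
            fp += 1
        elif is_positive:
            fn += 1
    # Convention (assumption): precision is 1.0 when nothing is called positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def lowest_threshold_for_precision(scores, labels, target=0.99):
    """Smallest observed score that reaches the target precision, or None."""
    for threshold in sorted(set(scores)):
        precision, _ = precision_recall_at(scores, labels, threshold)
        if precision >= target:
            return threshold
    return None


# Toy data: three positives and two negatives.
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.9]
toy_labels = [0, 0, 1, 1, 1]
toy_threshold = lowest_threshold_for_precision(toy_scores, toy_labels)  # 0.8
```

Choosing the lowest qualifying threshold keeps recall as high as possible while still meeting the precision target.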

	Step (3): Summarize the means and standard deviations of the precision and recall values at the desired probability thresholds determined in Step (2) for all functional sites.

	summarize_tp_fn_fold_PR_SD_CNN.py
	summarize_tp_fn_fold_PR_SD_SVM.py
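
The Step (3) summary can be illustrated with the standard library alone. This is a sketch under assumptions: `summarize_folds` is a hypothetical name, the input is one (precision, recall) pair per fold, and the sample standard deviation (n - 1 denominator) is used. The summarize scripts may use a different convention.

```python
from statistics import mean, stdev


def summarize_folds(per_fold):
    """Mean and sample SD of precision and recall across folds.

    per_fold -- list of (precision, recall) tuples, one per fold;
                at least two folds are needed for the SD.
    """
    precisions = [precision for precision, _ in per_fold]
    recalls = [recall for _, recall in per_fold]
    return {
        "precision_mean": mean(precisions),
        "precision_sd": stdev(precisions),
        "recall_mean": mean(recalls),
        "recall_sd": stdev(recalls),
    }


# Toy values for five folds (not real results).
toy_folds = [(1.0, 0.8), (0.98, 0.9), (0.99, 0.85), (1.0, 0.75), (0.98, 0.7)]
toy_summary = summarize_folds(toy_folds)
```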

	Step (4): Evaluate the probability scores of each PROSITE false negative and PROSITE false positive site using the trained five-fold models.

	eval_fn_1DCNN.py 
	eval_fn_3DCNN.py
	eval_fn_FEATURE_SVM.py 
	eval_fn_Voxel_SVM.py 
	eval_fp_1DCNN.py 
	eval_fp_3DCNN.py
	eval_fp_FEATURE_SVM.py 
	eval_fp_Voxel_SVM.py 

	Step (5): Using the generated probability scores, evaluate the performance of each functional family at the threshold determined in Step (2).

	PR_fn_1DCNN.py
	PR_fn_3DCNN.py
	PR_fn_FEATURE_SVM.py 
	PR_fn_Voxel_SVM.py 
	PR_fp_1DCNN.py
	PR_fp_3DCNN.py
	PR_fp_FEATURE_SVM.py
	PR_fp_Voxel_SVM.py
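
One way to read Step (5) is as a count of how many PROSITE false negative or false positive sites score at or above the Step (2) threshold. The sketch below makes that assumption explicit. The function name and the toy scores are hypothetical, and the PR_fn_*.py and PR_fp_*.py scripts may report different quantities.

```python
def fraction_at_or_above(scores, threshold):
    """Fraction of sites whose probability score is >= threshold."""
    if not scores:
        raise ValueError("no scores given")
    return sum(1 for score in scores if score >= threshold) / len(scores)


# Toy scores (not real results) evaluated at an assumed threshold of 0.99.
toy_fn_scores = [0.995, 0.2, 0.999, 0.97]
toy_fp_scores = [0.01, 0.5, 0.992]

# Share of PROSITE false negatives that the model calls positive.
fn_recovered = fraction_at_or_above(toy_fn_scores, 0.99)
# Share of PROSITE false positives that the model does not call positive.
fp_rejected = 1.0 - fraction_at_or_above(toy_fp_scores, 0.99)
```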

results/ --- Prediction probability scores of each site and the trained model weights for the 10 functional sites

- results/weights stores the trained weights of the 10 functional site models for each method

- results/prob_score contains predicted probability scores generated from Step (1)
- results/FN_prob contains predicted probability scores generated from Step (4)
- results/FP_prob contains predicted probability scores generated from Step (4)

- results/TP_TN_results contains summary files generated from Step (3)
- results/FN_results contains summary files generated from Step (5)
- results/FP_results contains summary files generated from Step (5)