- Create a Conda environment:
conda create --name struct2graph python=3.8.19
- Install PyTorch (the package is published on PyPI as torch):
pip install torch==1.4.0
- Install scikit-learn:
pip install scikit-learn==1.3.2
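After installing, it can help to confirm that the environment has the pinned versions. The following is a minimal sketch, not part of the repository: the helper names installed_version and check_pins are hypothetical, and it assumes PyTorch is registered under the distribution name torch, which is how it is published on PyPI.

```python
from importlib import metadata
from typing import Dict, Optional

# Versions pinned in the steps above (distribution names as used by pip).
PINNED = {"torch": "1.4.0", "scikit-learn": "1.3.2"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins: Dict[str, str] = PINNED) -> Dict[str, bool]:
    """Map each package name to True if its installed version matches the pin."""
    result = {}
    for name, wanted in pins.items():
        found = installed_version(name)
        # Ignore a local build suffix such as "+cpu" when comparing.
        result[name] = found is not None and found.split("+")[0] == wanted
    return result
```

Calling check_pins() inside the struct2graph environment returns a dictionary such as {"torch": True, "scikit-learn": True} when both pins are satisfied.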
- Create the list of proteins: to use your own dataset, create a TXT file that lists the proteins, one per line, in the format:
UniProt ID, PDB ID and chain ID, separated by spaces. For example:
A0FGR8 4P42 A
A4UGR9 4F14 B
A5JGM8 5LR0 A
*** You must have the PDB file for each protein, and each file must contain the corresponding chain.
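Before running the pipeline, the protein list can be checked for formatting problems. The sketch below is only an illustration under stated assumptions: the function name parse_protein_list is hypothetical, the fields are assumed to be whitespace-separated as in the example, and the ID patterns (a 6- or 10-character UniProt accession, a 4-character PDB ID starting with a digit) are a loose sanity check, not the validation used by parse_entries.py.

```python
import re
from typing import List, Tuple

# Loose, assumed patterns for a quick sanity check of each field.
UNIPROT_RE = re.compile(r"^[A-Z0-9]{6}([A-Z0-9]{4})?$")
PDB_RE = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def parse_protein_list(text: str) -> List[Tuple[str, str, str]]:
    """Parse 'UniProtID PDBID chainID' lines into (uniprot, pdb, chain) tuples."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        uniprot_id, pdb_id, chain_id = fields
        if not UNIPROT_RE.match(uniprot_id):
            raise ValueError(f"line {lineno}: unexpected UniProt ID {uniprot_id!r}")
        if not PDB_RE.match(pdb_id):
            raise ValueError(f"line {lineno}: unexpected PDB ID {pdb_id!r}")
        entries.append((uniprot_id, pdb_id, chain_id))
    return entries
```

To use it, read the TXT file with open(path).read() and pass the text to parse_protein_list; a ValueError names the first line that does not fit the format.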
- Run the script parse_entries.py: it will create a folder called input1 that contains:
proteins_X_Y.npy → numerical fingerprints per residue
adjacencies_X_Y.npy → residue adjacency matrices
names_X_Y.npy → PDB IDs
seqs_X_Y.npy → amino acid sequences
fingerprint_dict.pickle → dictionary of fingerprint encodings.
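Once parse_entries.py has finished, a quick check that input1 holds every file listed above can catch a failed run early. This is a minimal sketch using only the file names given in this list: the helper names find_outputs and missing_outputs are hypothetical, and the X_Y part of each name is matched with a wildcard because its values are not specified here.

```python
from pathlib import Path
from typing import Dict, List

# File name patterns taken from the list above; "*" stands in for X_Y.
EXPECTED_PATTERNS = [
    "proteins_*.npy",
    "adjacencies_*.npy",
    "names_*.npy",
    "seqs_*.npy",
    "fingerprint_dict.pickle",
]


def find_outputs(folder: str = "input1") -> Dict[str, List[str]]:
    """Map each expected pattern to the sorted file names that match it."""
    root = Path(folder)
    return {
        pattern: sorted(path.name for path in root.glob(pattern))
        for pattern in EXPECTED_PATTERNS
    }


def missing_outputs(folder: str = "input1") -> List[str]:
    """Return the patterns for which no file was found in the folder."""
    return [pattern for pattern, files in find_outputs(folder).items() if not files]
```

An empty list from missing_outputs("input1") means at least one file was found for every expected pattern.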
- Run create_examples.py
- Run k-fold-CV.py