Code to reproduce the results of the paper, "Horizontal Gene Transfer Inference: Gene presence-absence outperforms gene trees" where we perform a comparative study of HGT inference methods.
Code to reproduce the results of the paper, "Horizontal Gene Transfer Inference: Gene presence-absence outperforms gene trees" where we perform a comparative study of HGT inference methods. See: [preprint in bioRxiv](https://www.biorxiv.org/content/10.1101/2024.12.27.630302).
Each folder for steps 01, 02, and 03, contain Jupyter notebooks that were run in the order of their numbering. Scripts and other helper functions that they depend on or are mentioned in the notebooks can be found in the `src/` or `lib/` directories respectively.
Each folder for steps 01, 02, and 03, contain Jupyter notebooks that were run in the order of their numbering. Scripts and other helper functions that they depend on or are mentioned in the notebooks can be found in the `src/` or `lib/` directories respectively.
Please note that using 'Run all' or equivalent in the jupyter notebooks will generally not be useful. Some of the intervening steps in the notebooks are markdown cells instructing how to run programs via shell, separately. These programs are time consuming ones, often utilising multiprocessing in an HPC. Please run the notebooks, cell by cell, keeping this in mind.
Please note that using 'Run all' or equivalent in the jupyter notebooks will generally not be useful. Some of the intervening steps in the notebooks are markdown cells instructing how to run programs via shell, separately. These programs are time consuming ones, often utilising multiprocessing in an HPC. Please run the notebooks, cell by cell, keeping this in mind.
This study downloads and processes an number of large files, which were stored in the `data` directory, which is empty in this repo. If you follow/run the notebooks you will progressively fill the `data` directory to reproduce all the results as well a the figures in the paper. Alternatively, you can download this repo containing the results and figures in the `data` directory, from the Zenodo link in the paper.
This study downloads and processes an number of large files, which were stored in the `data` directory, which is empty in this repo. If you follow/run the notebooks you will progressively fill the `data` directory to reproduce all the results as well a the figures in the paper. Alternatively, you can download this repo containing the results and figures in the `data` directory, from the [Zenodo link](https://zenodo.org/records/14555036) in the paper.