
[QUESTION] ko2kegg_abundance(): filter_for_prokaryotes does not remove Human Diseases pathways + questions on KEGG/MetaCyc databases #199

@Julousa


Hi,

I’m encountering an issue with the filter_for_prokaryotes argument of the ko2kegg_abundance() function and would like to understand what is happening.
When I run the function with filter_for_prokaryotes = TRUE and then with filter_for_prokaryotes = FALSE, I get the same number of pathways in both cases (550 pathways). Pathways classified under Human Diseases are still present even with filter_for_prokaryotes = TRUE, whereas I expected them to be removed. I am using PICRUSt version 2.5.13. Do you have any idea why the filtering does not seem to have any effect?
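
For illustration, the comparison described above can be sketched outside the package. This is a minimal Python sketch, not ggpicrust2 code: the pathway IDs are hypothetical row names copied by hand from the two outputs, the helper names are made up for this example, and matching on the ko05xxx prefix is only a rough heuristic based on the KEGG map numbering for the Human Diseases category.

```python
import re

# Assumption: KEGG pathway maps numbered 05xxx belong to the Human Diseases
# category. This prefix match is a rough heuristic, not the package's own filter.
HUMAN_DISEASE_ID = re.compile(r"^ko05\d{3}$")


def human_disease_ids(pathway_ids):
    """Return the pathway IDs that look like Human Diseases maps (ko05xxx)."""
    return sorted(p for p in pathway_ids if HUMAN_DISEASE_ID.match(p))


def compare_runs(ids_filtered, ids_unfiltered):
    """Compare the pathway IDs returned with and without the prokaryote filter."""
    filtered, unfiltered = set(ids_filtered), set(ids_unfiltered)
    return {
        # Pathways present only in the unfiltered run, i.e. actually removed.
        "removed_by_filter": sorted(unfiltered - filtered),
        # Human Diseases pathways that survived the filter.
        "human_disease_left": human_disease_ids(filtered),
    }


# Hypothetical row names from the two runs (TRUE and FALSE).
with_filter = ["ko00010", "ko05200", "ko02010"]
without_filter = ["ko00010", "ko05200", "ko02010"]

report = compare_runs(with_filter, without_filter)
print(report)
```

An empty "removed_by_filter" list together with a non-empty "human_disease_left" list corresponds to the behaviour reported here: the two runs are identical and disease pathways remain.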

I also have a few questions about the databases and about the annotation of PICRUSt2 output files:

Why does the ko2kegg_abundance() function return ko pathways that are then not annotated during the pathway_annotation step? For example, I get the following warning message: WARN: KO ID ko99980 not found in KEGG database (HTTP 404)
Do these two functions rely on different versions of the KEGG database? I thought that KEGG was no longer an open-source database, so which KEGG version does ggpicrust2 actually use? Do these functions rely on the paid KEGG database?

After pathway annotation, some pathways appear to be highly abundant in my dataset even though only a very small number of their constituent KOs are present (as visualized with KEGG Mapper). Is a pathway considered present as soon as one of its constituent KOs is detected? Is there a way to normalize pathway abundance by the number of KOs detected in each pathway?
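
To make the normalization idea concrete, here is a minimal Python sketch of one possible approach: compute, for each pathway, the fraction of its member KOs that were detected, then scale the pathway abundance by that fraction. This is an assumption about what such a normalization could look like, not a description of what ggpicrust2 or PICRUSt2 does. The function names, the KO-to-pathway mapping, and all numbers are hypothetical toy values.

```python
def pathway_coverage(pathway_kos, detected_kos):
    """Fraction of each pathway's member KOs that were detected (0.0 to 1.0)."""
    detected = set(detected_kos)
    coverage = {}
    for pathway, kos in pathway_kos.items():
        members = set(kos)
        # Guard against pathways with an empty member list.
        coverage[pathway] = len(members & detected) / len(members) if members else 0.0
    return coverage


def coverage_weighted(abundance, coverage):
    """Scale each pathway abundance by its KO coverage (missing coverage counts as 0)."""
    return {p: a * coverage.get(p, 0.0) for p, a in abundance.items()}


# Toy mapping of pathways to member KOs (hypothetical, for illustration only).
pathway_kos = {
    "pathway_A": ["K1", "K2", "K3", "K4"],
    "pathway_B": ["K5", "K6", "K7", "K8"],
}
detected_kos = ["K1", "K2", "K3", "K5"]               # KOs with non-zero abundance
abundance = {"pathway_A": 100.0, "pathway_B": 400.0}  # unnormalized pathway abundance

coverage = pathway_coverage(pathway_kos, detected_kos)
weighted = coverage_weighted(abundance, coverage)
print(coverage)   # pathway_A: 0.75, pathway_B: 0.25
print(weighted)   # pathway_A: 75.0, pathway_B: 100.0
```

In this toy case pathway_B looks four times as abundant as pathway_A before weighting, but only one of its four KOs is detected, so the weighted values end up much closer. A coverage threshold (for example, dropping pathways below some minimum fraction) would be an alternative design.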

Finally, PICRUSt2 describes the MetaCyc database as an open-source alternative to KEGG, but when I try to access it, this does not seem to be the case. I therefore have the same question as for KEGG: which MetaCyc version does ggpicrust2 actually use? Is it possible that the package functions rely on the paid MetaCyc database?

Sorry if these are basic questions; I am still a beginner with PICRUSt2, KEGG and MetaCyc.

Thanks in advance for your help!

Best regards
