AI tools (Claude, Perplexity, etc.) are used in this project through their chat interfaces. They mostly help with brainstorming and suggesting alternatives, but some of the code may be taken directly from their suggestions.
This repository contains code for laboratory analysis of flow cytometry and next-generation sequencing data provided by my institution's core facility (Biomicro Center).
The most important scripts are in the core_scripts/ directory.
git clone https://github.com/luised94/lab_utils.git
See docs/bmc_chip_seq_analysis_instructions.qmd for sequencing analysis.
The README and documentation need updating. For now, the quick start is the easiest way to get started.
While developing this collection of scripts, I have gone back and forth between different organization styles. For now, I have settled on sourcing files during an interactive session, because the environment persists after the script finishes: I can try code on the variables and run quick debugging sessions.
Not all of the scripts are updated to this style yet!
Because I perform the next-generation sequencing analysis on my institution's cluster, I mostly have to use the versions of the tools that are installed there.
For this reason, I use R 4.2.0 to perform the analyses.
The analyses are run locally (WSL2 on Windows) or on a Linux computing cluster. The cluster dictates which dependencies are used, especially for the next-generation sequencing analysis.
- R 4.2.0
- bash 4.2
- Command line utils
- bowtie2/2.3.5.1
- fastp/0.20.0
- fastqc/0.11.5
- deeptools/3.0.1
- gatk
- python/2.7
- miniforge
- macs2
- picard
- java
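A minimal sketch for checking that the tools listed above are available before running an analysis. The function name check_tools is hypothetical and is not part of the repository, and the executable names in the example call are assumptions: on a cluster the tools may first need to be loaded, or may be installed under different names.

```shell
# Hypothetical helper (not part of the repository).
# Prints the name of every given tool that is not found on PATH.
# Returns 0 if all tools are found, 1 otherwise.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call. The executable names are assumptions; adjust to your setup.
# check_tools R bowtie2 fastp fastqc macs2 java
```

An empty output means every tool in the call was found.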
I assume scripts will be run while the current working directory is the repository root.
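The working-directory assumption above could be checked with a small guard like this sketch. The function name is_repo_root is hypothetical, and using the presence of core_scripts/ as the marker for the repository root is my assumption, not something the scripts currently do.

```shell
# Hypothetical helper (not part of the repository).
# Returns 0 if the given directory (default: current directory) contains
# a core_scripts/ subdirectory, which is taken here as the sign of the
# repository root.
is_repo_root() {
    local dir="${1:-.}"
    [ -d "$dir/core_scripts" ]
}

# Example guard for the top of a script:
# is_repo_root || { echo "Run this from the repository root." >&2; exit 1; }
```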
Most bash scripts can be run directly from the command line.
./script.sh <args>
Most R scripts can be run from the command line with Rscript.
Rscript script.R <args>
R scripts can also be sourced from an interactive R session.
R --no-save
# Start up message should show.
> source("core_scripts/<script_name>.R")
Most scripts write a log file (stdout and stderr) that can be inspected with a text editor. The log files can usually be checked with vim ~/data/
/logs/9004526_1.out
Each documentation section has a troubleshooting section that describes common errors, such as scripts that depend on the names of the files.
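For the log files mentioned above, a sketch for finding the newest one in a log directory. The function name newest_log is hypothetical, the .out extension is assumed from the example file name, and the log directory must be passed in explicitly because its location depends on the experiment.

```shell
# Hypothetical helper (not part of the repository).
# Prints the path of the most recently modified *.out file in the given
# directory, or nothing if the directory has no such files.
newest_log() {
    local log_dir="$1"
    # ls -t sorts by modification time, newest first.
    ls -t "$log_dir"/*.out 2>/dev/null | head -n 1
}

# Example: open the newest log in vim (the directory is a placeholder).
# vim "$(newest_log path/to/logs)"
```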
Notes I take while developing the scripts.
IGNORE THIS SECTION. I have a set of tags that I try to use to mark code for future reference. The form of the tags is . Recursive grep (grep -r) can be used to find the tags.
- TODO: Tasks that I have to complete for that particular code file.
- HOWTO: Code snippets kept for reference when I want to see how to do a particular thing.
- FIXME: Areas that need fixing.
- NOTE: Important notes or explanations.
- BUG: Known bugs or issues.
- OPTIMIZE: Areas that could be optimized for better performance.
- REFACTOR: Code that needs refactoring.
- TEST: For testing purposes.
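A sketch of the recursive grep search for the tags above. The function name find_tags is hypothetical. Because the exact form of the tags is not shown here, the pattern matches the bare tag words as whole words, so it may also pick up ordinary uses of words such as NOTE or TEST.

```shell
# Hypothetical helper (not part of the repository).
# Recursively lists every line that contains one of the tag words,
# prefixed with the file name and line number.
# -r: recurse into directories, -n: show line numbers,
# -w: match whole words only, -E: extended regex for the alternation.
find_tags() {
    local search_dir="${1:-.}"
    grep -rnwE 'TODO|HOWTO|FIXME|NOTE|BUG|OPTIMIZE|REFACTOR|TEST' "$search_dir"
}

# Example: search the main scripts directory from the repository root.
# find_tags core_scripts
```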