This repository archives the notes and materials from my computer science self-learning journey. I currently focus on LLM/VLM inference engines and GPU/NPU computing, so I have gathered many technical blogs for AI infra beginners and MLSys papers for researchers.

Feel free to star this repository! 😊

🔍 Contents:

- Programming Languages
- Data Structure & Algorithm
- Network
- Operating System
- Design Pattern
- Mathematics
- Deep Learning
- LLM
- Multi-Modality
- AI Infra
- Roadmap
- Backend Development
- Big Data Development
- Open Source Best Practices
- Research
- Employment

| Title | Category | Author | Note | Rec | Read |
|---|---|---|---|---|---|
| PyTorch 显存管理介绍与源码解析(一) (PyTorch GPU memory management: introduction and source code analysis, part 1) | Memory | @kaiyuan | | ⭐️⭐️⭐️⭐️ | ✅ |
| PyTorch 显存可视化与 Snapshot 数据分析 (PyTorch GPU memory visualization and snapshot data analysis) | Memory | @kaiyuan | | ⭐️⭐️⭐️⭐️ | ✅ |
| A guide on good usage of non_blocking and pin_memory() in PyTorch | Data Transfer | @Vincent Moens | | ⭐️⭐️⭐️⭐️⭐️ | ✅ |

| Title | Category | Author | Note | Rec | Read |
|---|---|---|---|---|---|
| CUDA 内核优化策略 (CUDA kernel optimization strategies) | Performance | @Zhang | | ⭐️⭐️⭐️ | ✅ |
| 从啥也不会到 CUDA GEMM 优化 (From knowing nothing to CUDA GEMM optimization) | Performance | @猛猿 | | ⭐️⭐️⭐️⭐️ | ✅ |

| Title | Category | Author | Note | Rec | Read |
|---|---|---|---|---|---|
| NCCL: Collective Operations | Collective Communication | @NVIDIA Developer | Common collective communication operations | ⭐️⭐️⭐️⭐️ | ✅ |
| 一文读懂\|RDMA 原理 (RDMA principles explained in one article) | Network | @Linux内核库 | | ⭐️⭐️⭐️⭐️ | ✅ |
| PyTorch 中基于 CUDA IPC 的进程间 Tensor 共享简介 (Introduction to inter-process tensor sharing via CUDA IPC in PyTorch) | IPC | @kaiyuan | | ⭐️⭐️⭐️ | ✅ |
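
| Title | Category | Author | Note | Rec | Read |
|---|---|---|---|---|---|
| 多模态技术梳理:ViT 系列 (Multimodal technology overview: the ViT series) | ViT | @姜富春 | Survey of ViT research | ⭐️⭐️⭐️ | ✅ |
| LLaVA 系列模型结构详解 (Detailed explanation of the LLaVA model family architecture) | ViT | @Zhang | | ⭐️⭐️⭐️ | ✅ |
| 文生图模型之 Stable Diffusion (Text-to-image models: Stable Diffusion) | Diffusion | @小小将 | | ⭐️⭐️⭐️⭐️ | ✅ |
| DiT 推理加速综述: Caching (Survey of DiT inference acceleration: caching) | Diffusion | @DefTruth | | ⭐️⭐️⭐️⭐️ | ✅ |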
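
| Title | Category | Author | Note | Rec | Read |
|---|---|---|---|---|---|
| 多模态技术梳理:Qwen-VL 系列 (Multimodal technology overview: the Qwen-VL series) | VL | @姜富春 | | ⭐️⭐️⭐️⭐️ | ✅ |
| Qwen2-VL 源码解读:从准备一条样本到模型生成全流程图解 (Qwen2-VL source code walkthrough: an illustrated end-to-end flow from preparing one sample to model generation) | VL | @姜富春 | | ⭐️⭐️⭐️⭐️⭐️ | ✅ |
| 万字长文图解 Qwen2.5-VL 实现细节 (An illustrated long-form guide to Qwen2.5-VL implementation details) | VL | @猛猿 | | ⭐️⭐️⭐️⭐️⭐️ | ✅ |
| Qwen3-VL 解剖 (Dissecting Qwen3-VL) | VL | @Plunck | | ⭐️⭐️⭐️ | ✅ |
| Qwen3-Omni:统一多模态大模型再升级 (Qwen3-Omni: another upgrade for unified multimodal large models) | Omni | @tomsheep | | | |
| Gated Attention:Qwen3-Next 背后的门控机制 (Gated Attention: the gating mechanism behind Qwen3-Next) | Attention | @Plunck | | | |
| Qwen3.5 解剖 (Dissecting Qwen3.5) | | @Plunck | | | |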
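
| Title | Category | Author | Note | Rec | Read |
|---|---|---|---|---|---|
| DeepSeek 技术解读(1)- 彻底理解 MLA(Multi-Head Latent Attention) (DeepSeek technical deep dive, part 1: thoroughly understanding MLA) | Attention | @姜富春 | | ⭐️⭐️⭐️⭐️ | ✅ |
| DeepSeek 技术解读(2)- MTP(Multi-Token Prediction)的前世今生 (DeepSeek technical deep dive, part 2: the past and present of MTP) | Decoding | @姜富春 | | ⭐️⭐️⭐️⭐️ | ✅ |
| DeepSeek 技术解读(3)- MoE 的演进之路 (DeepSeek technical deep dive, part 3: the evolution of MoE) | MoE | @姜富春 | | ⭐️⭐️⭐️⭐️ | ✅ |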
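
| Title | Category | Author | Note | Rec | Read |
|---|---|---|---|---|---|
| LLM Inference 高效 Debug 方法汇总 (A summary of efficient debugging methods for LLM inference) | Debug | @CarryPls | | ⭐️⭐️ | ✅ |
| 推理性能优化:GPU/NPU Profiling 阅读引导 (Inference performance optimization: a guide to reading GPU/NPU profiling results) | Profiling | @kaiyuan | | ⭐️⭐️⭐️⭐️ | ✅ |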

Refer to "How to Read a Paper" for a practical and efficient three-pass method for reading papers.

| Project | Category | Author/Organization | About |
|---|---|---|---|
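| llm-action | MLSys | @liguodongiot | This project shares the technical principles behind large models and hands-on experience with them (large-model engineering and real-world application deployment). |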
| awesomeMLSys | MLSys | @GPU MODE | An ML Systems Onboarding list. |
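| InfraTech | MLSys | @CalvinXKY | Shares AI Infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, large model fundamentals 🧠, AI software and hardware 🔧, and more. |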
| AI-Infra-from-Zero-to-Hero | MLSys | @HuaizhengZhang | 🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑‍💻 Video Tutorials. |
| resource-stream | CUDA | @GPU MODE | GPU programming related news and material links. |
| BasicCUDA | CUDA | @CalvinXKY | A tutorial for CUDA & PyTorch. |
@misc{cs-self-learning@2023,
title = {cs-self-learning},
url = {https://github.com/shen-shanshan/cs-self-learning},
note = {Open-source software available at https://github.com/shen-shanshan/cs-self-learning},
author = {shen-shanshan},
year = {2023}
}

This repository is released under the MIT License; find more details here.