bytedance · Avicennasis · Jun 13, 2026
diff --git a/README.md b/README.md
@@ -35,7 +35,7 @@ ByteIR Compiler is an MLIR-based compiler for CPU/GPU/ASIC.
 
 ## [Runtime](runtime/README.md)
 
-ByteIR Runtime is a common, lightweight runtime, capable to serving both existing kernels and ByteIR compiler generated kernels.
+ByteIR Runtime is a common, lightweight runtime, capable of serving both existing kernels and ByteIR compiler generated kernels.
 
 ## [Frontends](frontends/README.md)
 

diff --git a/compiler/doc/gpu.md b/compiler/doc/gpu.md
@@ -12,7 +12,7 @@ ByteIR Compiler has developed multiple passes to support GPU backends.
 ### InsertTrivialSCFLoop Pass
 
 This pass simply inserts a trivial scf ForOp for scalar computation.
-It is typically used to simplify a pass pipeline without checking whether scalar computation later. 
+It is typically used to simplify a pass pipeline without checking for scalar computation later. 
 Note the effect of this pass will be removed after the scf canonicalizer.  
 
 ```
@@ -47,7 +47,7 @@ func.func @scalar_func(%arg0: memref<f32>) -> memref<f32> {
 ```
 
 ### ConvertFuncToGPU Pass
-This pass convert a FuncOp in a loop form into a GPUFuncOp in a SIMT form. 
+This pass converts a FuncOp in a loop form into a GPUFuncOp in a SIMT form. 
 Some loops with annotation in the source FuncOp are transformed into a SIMT statement. 
 
 ```

diff --git a/frontends/onnx-frontend/README.md b/frontends/onnx-frontend/README.md
@@ -93,13 +93,13 @@ onnx-frontend model.onnx -batch-size=1 -invokeOnnxVersionConverter -o model.stab
 ### How to Add ONNX-2-STABLEHLO conversion
 - Before you start
   - Build onnx-mlir, see [this](https://github.com/onnx/onnx-mlir/blob/main/docs/BuildOnLinuxOSX.md).
-  - Learn the basic knowledge of MLIR, especially the [Pattern Rewritten doc](https://mlir.llvm.org/docs/PatternRewriter/) and the [DDR doc](https://mlir.llvm.org/docs/DeclarativeRewrites/).
+  - Learn the basics of MLIR, especially the [Pattern Rewriter doc](https://mlir.llvm.org/docs/PatternRewriter/) and the [DRR doc](https://mlir.llvm.org/docs/DeclarativeRewrites/).
 
 - Workflow
   - Develop new patterns in third_party/onnx-mlir
   - Save changes into a new .patch file under `third_party/patches` folder
 
-- Steps to add a onnx-2-stablehlo convertion pattern
+- Steps to add an onnx-2-stablehlo conversion pattern
 
   - Take a look at the [definition of the onnx op you want to convert](https://github.com/onnx/onnx-mlir/blob/main/src/Dialect/ONNX/ONNXOps.td.inc), and onnx is more like a coarse-grained op collection.
   - Take a look at the [corresponding stablehlo op definition](https://github.com/openxla/stablehlo/blob/main/stablehlo/dialect/StablehloOps.td), and stablehlo is more like a fine-grained op collection.

diff --git a/frontends/tf-frontend/docs/developer_guild.md b/frontends/tf-frontend/docs/developer_guild.md
@@ -1,18 +1,18 @@
 # Contribute to TF-Frontend
 
-Here's some infomation which might be useful for developers.
+Here's some information which might be useful for developers.
 
 ## File Structure
 1. Upstream tensorflow is a submodule of this repo lying on `external/tensorflow`, it will be built together and some simple patches will be applied on tensorflow before compiling.
 2. Main source code:
     - tf_mlir_ext: used as an extension of the upstream tensorflow dialect.
         - byteir/: Contains source code shared with byteir compiler. Most of them are symbolic link files.
-        - ir/: Used to define extented operations of tensorflow dialect. Here TF.OpaqueOp is only used as an example to demostrate how to add tf operations outside the upstream tf repo. Note that we should add the extented operation when I have to, and it also need to be rationalized before adding it.
+        - ir/: Used to define extended operations of tensorflow dialect. Here TF.OpaqueOp is only used as an example to demonstrate how to add tf operations outside the upstream tf repo. Note that we should add an extended operation only when we have to, and it also needs to be justified before it is added.
         - transforms/: additional passes on tf dialect
         - tests/: lit tests for the additional passes
-        - pipelines/: a pass pipeline contructed to convert tf dialect to mhlo dialect
+        - pipelines/: a pass pipeline constructed to convert tf dialect to mhlo dialect
     - tools: 
-        - tf_frontend_main.cc: the main executable binary will be built from this, it is used convert a tf model of protobuf format to mhlo IR.
+        - tf_frontend_main.cc: the main executable binary is built from this file; it is used to convert a tf model in protobuf format to mhlo IR.
         - tf_ext_opt_main.cc: an debug targeted executable binary, always used to test a pass or some passes.
     - utils: some non-mlir utilities
 
@@ -23,4 +23,4 @@ bazel --output_user_root=./build test //tf_mlir_ext/tests:all --java_runtime_ver
 ```
 
 ## How to Debug
-The most frequent errors occur during the tf-frontend pass pipeline. If that happens, a typical debuging method is to add `-reproduce-file [path/to/file]` to the tf-frontend command. Then we could know which pass cause the error and the IR before that pass. After that, tf-ext-opt is recommended to be used to debug the specific pass. For more debugging tips, please refer to [mlir doc ](https://mlir.llvm.org/getting_started/Debugging/) or by typing `tf-frontend --help`.
+The most frequent errors occur during the tf-frontend pass pipeline. If that happens, a typical debugging method is to add `-reproduce-file [path/to/file]` to the tf-frontend command. Then we can tell which pass causes the error and see the IR before that pass. After that, we recommend using tf-ext-opt to debug the specific pass. For more debugging tips, please refer to the [mlir doc](https://mlir.llvm.org/getting_started/Debugging/) or type `tf-frontend --help`.
diff --git a/frontends/torch-frontend/doc/torch_2_0_training.md b/frontends/torch-frontend/doc/torch_2_0_training.md
@@ -1,5 +1,5 @@
 # ByteIR Compiler for Torch 2.0 Training
-ByteIR is a well optmized compiler for PyTorch 2.0. This doc will introduce our practice on accelerating PyTorch 2.0 on NVGPU by compilation technology.
+ByteIR is a well-optimized compiler for PyTorch 2.0. This doc introduces our practice of accelerating PyTorch 2.0 on NVGPU with compilation technology.
 
 ## Overview
 Overview of our compilation pipeline:
@@ -27,14 +27,14 @@ Codegen  AITemplate  Runtime Library
 
 ## Prepare Materials
 ### Environment
-We recommond the following environment for building:  
+We recommend the following environment for building:  
 * cuda>=11.8
 * python>=3.9
 * gcc>=8.5 or clang>=7  
 
 Or use our **[Dockerfile](../../../docker/Dockerfile)** to build an docker image.
 ### Build ByteIR Components
-See each components' README to build themself:  
+See each component's README for how to build it:  
 * [Torch Frontend](../README.md)
 * [ByteIR Compiler](../../../compiler/README.md)
 * [ByteIR Runtime](../../../runtime/README.md)

diff --git a/runtime/README.md b/runtime/README.md
@@ -60,7 +60,7 @@ python3 setup.py bdist_wheel
 ```
 
 ### Testing
-#### Linux/Max
+#### Linux/Mac
 ```bash
 cd ./runtime/build
 ./bin/brt_test_all

diff --git a/runtime/include/brt/backends/README.md b/runtime/include/brt/backends/README.md
@@ -7,8 +7,8 @@ Each HW backend folder typically has ```device``` and ```providers``` folders.
 The ```device``` folder contains the HW-specific classes, and the derived runtime abstract classes, such as ```Allocator``` and ```WorkQueue```.
 The ```providers``` folder contains at least one ```provider```, which represents a collection of op implementations. 
 Note a backend can have more than one ```provider```. For example, a ```CUDA``` backend can have one ```provider``` relying on custom libraries and another ```provider``` using TensorRT.
-Muliple ```providers``` can work together to execute a model. 
-Note if multiple ```providers``` all have implmentations for a given op, the op will be executed by the **first** (in terms of the registration order) ```provider```.
+Multiple ```providers``` can work together to execute a model. 
+Note if multiple ```providers``` all have implementations for a given op, the op will be executed by the **first** (in terms of the registration order) ```provider```.
 
 
 ### Device
@@ -24,10 +24,10 @@ Note the predefined contract may or may not always imply the real execution orde
 A ```provider``` typically implements a collection of op implementations, each of which is derived from ```OpKernel```, and register during construction of the ```provider```.
 
 ```OpKernel``` can specify execution logic for construction (coded in constructor), ```ProloguePerSession```, ```EpiloguePerSession```, ```ProloguePerFrame```, ```EpiloguePerFrame```, and ```Run```.
-The construciton happens per ```OpKernel``` instance; 
+The construction happens per ```OpKernel``` instance; 
 ```ProloguePerSession``` executes in the beginning of a session; 
 ```EpiloguePerSession``` executes in the end of a session; 
-```ProloguePerFrame``` execute in the beginning of a frame. Here, a frame is defined as an I\O workspace of inference or training; 
+```ProloguePerFrame``` executes in the beginning of a frame. Here, a frame is defined as an I/O workspace of inference or training; 
 ```EpiloguePerFrame``` executes in the end of a frame;
 ```Run``` executes per run. 
 Note if a frame is reused, say run twice, ```ProloguePerFrame``` and ```EpiloguePerFrame``` only execute once, while ```Run``` would execute twice.
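
The `OpKernel` lifecycle described in the last hunk (construction once per instance, session hooks once per session, frame hooks once per frame, `Run` once per run) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyOpKernel` and `simulate` are hypothetical names, written in Python for brevity, and they are not part of the BRT API, which is C++. Only the hook names come from the README text.

```python
# Toy model of the OpKernel lifecycle described in the README hunk above.
# ToyOpKernel and simulate are hypothetical; they only record the call order.


class ToyOpKernel:
    def __init__(self, log):
        # Construction logic runs once per kernel instance.
        self.log = log
        self.log.append("construct")

    def prologue_per_session(self):
        self.log.append("ProloguePerSession")

    def epilogue_per_session(self):
        self.log.append("EpiloguePerSession")

    def prologue_per_frame(self):
        self.log.append("ProloguePerFrame")

    def epilogue_per_frame(self):
        self.log.append("EpiloguePerFrame")

    def run(self):
        self.log.append("Run")


def simulate(runs_per_frame):
    """Drive one session with a single frame that is reused for several runs."""
    log = []
    kernel = ToyOpKernel(log)
    kernel.prologue_per_session()
    kernel.prologue_per_frame()      # once, even if the frame is reused
    for _ in range(runs_per_frame):
        kernel.run()                 # once per run
    kernel.epilogue_per_frame()      # once, at the end of the frame
    kernel.epilogue_per_session()
    return log


events = simulate(runs_per_frame=2)
for name in events:
    print(name)
```

With `runs_per_frame=2`, the log contains `Run` twice, while each prologue and epilogue hook appears only once. This matches the note that a reused frame does not repeat `ProloguePerFrame` or `EpiloguePerFrame`.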