2 changes: 1 addition & 1 deletion README.md
@@ -35,7 +35,7 @@ ByteIR Compiler is an MLIR-based compiler for CPU/GPU/ASIC.

## [Runtime](runtime/README.md)

ByteIR Runtime is a common, lightweight runtime, capable to serving both existing kernels and ByteIR compiler generated kernels.
ByteIR Runtime is a common, lightweight runtime, capable of serving both existing kernels and ByteIR compiler generated kernels.

## [Frontends](frontends/README.md)

4 changes: 2 additions & 2 deletions compiler/doc/gpu.md
@@ -12,7 +12,7 @@ ByteIR Compiler has developed multiple passes to support GPU backends.
### InsertTrivialSCFLoop Pass

This pass simply inserts a trivial scf ForOp for scalar computation.
It is typically used to simplify a pass pipeline without checking whether scalar computation later.
It is typically used to simplify a pass pipeline without checking for scalar computation later.
Note the effect of this pass will be removed after the scf canonicalizer.

```
@@ -47,7 +47,7 @@ func.func @scalar_func(%arg0: memref<f32>) -> memref<f32> {
```

### ConvertFuncToGPU Pass
This pass convert a FuncOp in a loop form into a GPUFuncOp in a SIMT form.
This pass converts a FuncOp in a loop form into a GPUFuncOp in a SIMT form.
Some loops with annotation in the source FuncOp are transformed into a SIMT statement.

```
4 changes: 2 additions & 2 deletions frontends/onnx-frontend/README.md
@@ -93,13 +93,13 @@ onnx-frontend model.onnx -batch-size=1 -invokeOnnxVersionConverter -o model.stab
### How to Add ONNX-2-STABLEHLO conversion
- Before you start
- Build onnx-mlir, see [this](https://github.com/onnx/onnx-mlir/blob/main/docs/BuildOnLinuxOSX.md).
- Learn the basic knowledge of MLIR, especially the [Pattern Rewritten doc](https://mlir.llvm.org/docs/PatternRewriter/) and the [DDR doc](https://mlir.llvm.org/docs/DeclarativeRewrites/).
- Learn the basic knowledge of MLIR, especially the [Pattern Rewriter doc](https://mlir.llvm.org/docs/PatternRewriter/) and the [DDR doc](https://mlir.llvm.org/docs/DeclarativeRewrites/).

- Workflow
- Develop new patterns in third_party/onnx-mlir
- Save changes into a new .patch file under `third_party/patches` folder

- Steps to add a onnx-2-stablehlo convertion pattern
- Steps to add an onnx-2-stablehlo conversion pattern

- Take a look at the [definition of the onnx op you want to convert](https://github.com/onnx/onnx-mlir/blob/main/src/Dialect/ONNX/ONNXOps.td.inc), and onnx is more like a coarse-grained op collection.
- Take a look at the [corresponding stablehlo op definition](https://github.com/openxla/stablehlo/blob/main/stablehlo/dialect/StablehloOps.td), and stablehlo is more like a fine-grained op collection.
10 changes: 5 additions & 5 deletions frontends/tf-frontend/docs/developer_guild.md
@@ -1,18 +1,18 @@
# Contribute to TF-Frontend

Here's some infomation which might be useful for developers.
Here's some information which might be useful for developers.

## File Structure
1. Upstream tensorflow is a submodule of this repo lying on `external/tensorflow`, it will be built together and some simple patches will be applied on tensorflow before compiling.
2. Main source code:
- tf_mlir_ext: used as an extension of the upstream tensorflow dialect.
- byteir/: Contains source code shared with byteir compiler. Most of them are symbolic link files.
- ir/: Used to define extented operations of tensorflow dialect. Here TF.OpaqueOp is only used as an example to demostrate how to add tf operations outside the upstream tf repo. Note that we should add the extented operation when I have to, and it also need to be rationalized before adding it.
- ir/: Used to define extended operations of tensorflow dialect. Here TF.OpaqueOp is only used as an example to demonstrate how to add tf operations outside the upstream tf repo. Note that we should add the extended operation when I have to, and it also need to be rationalized before adding it.
- transforms/: additional passes on tf dialect
- tests/: lit tests for the additional passes
- pipelines/: a pass pipeline contructed to convert tf dialect to mhlo dialect
- pipelines/: a pass pipeline constructed to convert tf dialect to mhlo dialect
- tools:
- tf_frontend_main.cc: the main executable binary will be built from this, it is used convert a tf model of protobuf format to mhlo IR.
- tf_frontend_main.cc: the main executable binary will be built from this, it is used to convert a tf model of protobuf format to mhlo IR.
- tf_ext_opt_main.cc: an debug targeted executable binary, always used to test a pass or some passes.
- utils: some non-mlir utilities

@@ -23,4 +23,4 @@ bazel --output_user_root=./build test //tf_mlir_ext/tests:all --java_runtime_ver
```

## How to Debug
The most frequent errors occur during the tf-frontend pass pipeline. If that happens, a typical debuging method is to add `-reproduce-file [path/to/file]` to the tf-frontend command. Then we could know which pass cause the error and the IR before that pass. After that, tf-ext-opt is recommended to be used to debug the specific pass. For more debugging tips, please refer to [mlir doc ](https://mlir.llvm.org/getting_started/Debugging/) or by typing `tf-frontend --help`.
The most frequent errors occur during the tf-frontend pass pipeline. If that happens, a typical debugging method is to add `-reproduce-file [path/to/file]` to the tf-frontend command. Then we could know which pass causes the error and the IR before that pass. After that, tf-ext-opt is recommended to be used to debug the specific pass. For more debugging tips, please refer to [mlir doc ](https://mlir.llvm.org/getting_started/Debugging/) or by typing `tf-frontend --help`.
6 changes: 3 additions & 3 deletions frontends/torch-frontend/doc/torch_2_0_training.md
@@ -1,5 +1,5 @@
# ByteIR Compiler for Torch 2.0 Training
ByteIR is a well optmized compiler for PyTorch 2.0. This doc will introduce our practice on accelerating PyTorch 2.0 on NVGPU by compilation technology.
ByteIR is a well optimized compiler for PyTorch 2.0. This doc will introduce our practice on accelerating PyTorch 2.0 on NVGPU by compilation technology.

## Overview
Overview of our compilation pipeline:
@@ -27,14 +27,14 @@ Codegen AITemplate Runtime Library

## Prepare Materials
### Environment
We recommond the following environment for building:
We recommend the following environment for building:
* cuda>=11.8
* python>=3.9
* gcc>=8.5 or clang>=7

Or use our **[Dockerfile](../../../docker/Dockerfile)** to build an docker image.
### Build ByteIR Components
See each components' README to build themself:
See each components' README to build themselves:
* [Torch Frontend](../README.md)
* [ByteIR Compiler](../../../compiler/README.md)
* [ByteIR Runtime](../../../runtime/README.md)
2 changes: 1 addition & 1 deletion runtime/README.md
@@ -60,7 +60,7 @@ python3 setup.py bdist_wheel
```

### Testing
#### Linux/Max
#### Linux/Mac
```bash
cd ./runtime/build
./bin/brt_test_all
8 changes: 4 additions & 4 deletions runtime/include/brt/backends/README.md
@@ -7,8 +7,8 @@ Each HW backend folder typically has ```device``` and ```providers``` folders.
The ```device``` folder contains the HW-specific classes, and the derived runtime abstract classes, such as ```Allocator``` and ```WorkQueue```.
The ```providers``` folder contains at least one ```provider```, which represents a collection of op implementations.
Note a backend can have more than one ```provider```. For example, a ```CUDA``` backend can have one ```provider``` relying on custom libraries and another ```provider``` using TensorRT.
Muliple ```providers``` can work together to execute a model.
Note if multiple ```providers``` all have implmentations for a given op, the op will be executed by the **first** (in terms of the registration order) ```provider```.
Multiple ```providers``` can work together to execute a model.
Note if multiple ```providers``` all have implementations for a given op, the op will be executed by the **first** (in terms of the registration order) ```provider```.


### Device
@@ -24,10 +24,10 @@ Note the predefined contract may or may not always imply the real execution orde
A ```provider``` typically implements a collection of op implementations, each of which is derived from ```OpKernel```, and register during construction of the ```provider```.

```OpKernel``` can specify execution logic for construction (coded in constructor), ```ProloguePerSession```, ```EpiloguePerSession```, ```ProloguePerFrame```, ```EpiloguePerFrame```, and ```Run```.
The construciton happens per ```OpKernel``` instance;
The construction happens per ```OpKernel``` instance;
```ProloguePerSession``` executes in the beginning of a session;
```EpiloguePerSession``` executes in the end of a session;
```ProloguePerFrame``` execute in the beginning of a frame. Here, a frame is defined as an I\O workspace of inference or training;
```ProloguePerFrame``` executes in the beginning of a frame. Here, a frame is defined as an I/O workspace of inference or training;
```EpiloguePerFrame``` executes in the end of a frame;
```Run``` executes per run.
Note if a frame is reused, say run twice, ```ProloguePerFrame``` and ```EpiloguePerFrame``` only execute once, while ```Run``` would execute twice.