jmerelnyc/care-transformer

# care-transformer

Mobile-optimized linear visual transformer with decoupled dual interaction mechanism

```bash
pip install care-transformer
```


```python
import torch
from care_transformer import CareTransformer

model = CareTransformer(
    img_size=224,
    patch_size=16,
    in_chans=3,
    num_classes=1000,
    embed_dim=384,
    depth=12,
    num_heads=6,
    linear_attn=True
)

x = torch.randn(1, 3, 224, 224)
out = model(x)  # logits, shape (1, 1000)
```

## notes

Linear-complexity attention for mobile deployment. Decouples spatial and channel interactions to reduce FLOPs without tanking accuracy.

Key differences from vanilla ViT:

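- Attention cost is O(n) in the number of tokens n, instead of O(n²)
- Dual pathway: spatial-then-channel vs channel-then-spatial interaction (see `spatial_first` on `DualInteractionBlock`)
- Optional grouped convolutions for token mixing (see `use_grouped_conv` and `conv_groups`)
- Quantization-friendly architecture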
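To make the O(n) versus O(n²) point concrete, here is a minimal sketch of generic kernelized linear attention. It is not taken from this repository: the `elu(x) + 1` feature map, the function names, and the toy sizes are all assumptions for illustration. It shows that summarizing keys and values once gives the same result as building the full n × n score matrix.

```python
import math
import random


def feature_map(x):
    """elu(x) + 1: a positive feature map (an assumption, not CARE's choice)."""
    return x + 1.0 if x > 0 else math.exp(x)


def random_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


def quadratic_attention(q, k, v):
    """Builds every query-key score: on the order of n^2 * d multiplies."""
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) for kj in k]
        norm = sum(scores)
        out.append([sum(s * vj[c] for s, vj in zip(scores, v)) / norm
                    for c in range(len(v[0]))])
    return out


def linear_attention(q, k, v):
    """Summarizes keys and values once: on the order of n * d^2 multiplies."""
    d, dv = len(k[0]), len(v[0])
    # d x dv summary of all key-value pairs, computed once for all queries.
    kv = [[sum(kj[a] * vj[c] for kj, vj in zip(k, v)) for c in range(dv)]
          for a in range(d)]
    k_sum = [sum(kj[a] for kj in k) for a in range(d)]
    out = []
    for qi in q:
        norm = sum(qa * ka for qa, ka in zip(qi, k_sum))
        out.append([sum(qi[a] * kv[a][c] for a in range(d)) / norm
                    for c in range(dv)])
    return out


random.seed(0)
n, d = 16, 4  # toy sizes, hypothetical and not a CARE configuration
q = [[feature_map(x) for x in row] for row in random_matrix(n, d)]
k = [[feature_map(x) for x in row] for row in random_matrix(n, d)]
v = random_matrix(n, d)

full = quadratic_attention(q, k, v)
fast = linear_attention(q, k, v)
max_diff = max(abs(x - y) for ra, rb in zip(full, fast) for x, y in zip(ra, rb))
print(f"max difference between the two forms: {max_diff:.2e}")
```

Both forms compute the same weighted average, so the difference is floating-point noise. The saving scales roughly with n / d, so it matters most when the token count is large relative to the per-head dimension.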

Benchmarks on ImageNet-1K:

| Model | Params | FLOPs | Top-1 | Latency (mobile) |
|---|---|---|---|---|
| care-tiny | 5M | 1.2G | 76.4% | 12ms |
| care-small | 12M | 2.8G | 81.2% | 24ms |
| care-base | 28M | 6.1G | 83.7% | 48ms |

Latency measured on Snapdragon 888, INT8 quantized.
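As a rough reading aid, the latencies above can be converted to an upper bound on throughput. This is plain arithmetic on the table values, assuming batch size 1 and that model inference is the only cost in the pipeline; the helper name is hypothetical.

```python
# Latency figures copied from the benchmark table (milliseconds per image).
latency_ms = {"care-tiny": 12, "care-small": 24, "care-base": 48}


def frames_per_second(ms_per_image):
    """Upper bound on throughput if inference were the only cost."""
    return 1000.0 / ms_per_image


fps = {name: frames_per_second(ms) for name, ms in latency_ms.items()}
for name, value in fps.items():
    print(f"{name}: ~{value:.1f} images/s")
# care-tiny: ~83.3 images/s
# care-small: ~41.7 images/s
# care-base: ~20.8 images/s
```

Under these assumptions, care-tiny and care-small fit inside the roughly 33 ms per-frame budget of a 30 fps camera feed, while care-base does not.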

```python
from care_transformer import CareTransformer
from care_transformer.utils import export_onnx, quantize_model

model = CareTransformer.from_pretrained('care-small')

# Export for mobile
export_onnx(model, 'care_small.onnx', opset=13)

# Post-training quantization
# train_loader is not defined here: pass your own DataLoader of calibration images
qmodel = quantize_model(
    model,
    calibration_loader=train_loader,
    backend='qnnpack'
)
```
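For readers new to post-training quantization, the sketch below shows the generic affine INT8 mapping that calibration data is typically used to fit. It does not describe what `quantize_model` does internally; the function names and the calibration values are hypothetical.

```python
def affine_params(calib_min, calib_max, qmin=-128, qmax=127):
    """Scale and zero point mapping [calib_min, calib_max] onto the INT8 range."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    calib_min, calib_max = min(calib_min, 0.0), max(calib_max, 0.0)
    if calib_max == calib_min:
        raise ValueError("calibration range is empty")
    scale = (calib_max - calib_min) / (qmax - qmin)
    zero_point = round(qmin - calib_min / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Round to the nearest integer step and clamp to the INT8 range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale


# Hypothetical activation values observed during calibration.
calibration = [-1.2, -0.3, 0.0, 0.8, 2.5, 3.9]
scale, zero_point = affine_params(min(calibration), max(calibration))
roundtrip = [dequantize(quantize(x, scale, zero_point), scale, zero_point)
             for x in calibration]
max_error = max(abs(a - b) for a, b in zip(calibration, roundtrip))
print(f"scale={scale:.4f} zero_point={zero_point} max_error={max_error:.4f}")
```

The round-trip error stays within one quantization step, which is why a calibration set that covers the real activation range matters: values outside it are clamped.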

```python
# Custom dual interaction block
import torch
from care_transformer.blocks import DualInteractionBlock

block = DualInteractionBlock(
    dim=384,
    num_heads=6,
    spatial_first=True,
    use_grouped_conv=True,
    conv_groups=4
)

feat = torch.randn(8, 196, 384)  # (B, N, C)
out = block(feat)
```
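The `(B, N, C)` shape above lines up with the constructor arguments in the first example if the model uses ViT-style non-overlapping patches with no extra class token. The README does not state this, so treat it as an assumption; the helper below is hypothetical and only does the arithmetic.

```python
def patch_token_count(img_size, patch_size):
    """Number of non-overlapping square patches in a square image."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    per_side = img_size // patch_size
    return per_side * per_side


num_tokens = patch_token_count(224, 16)  # 14 * 14 = 196, the N in (B, N, C)
head_dim = 384 // 6                      # 64 channels per head (dim / num_heads)
print(num_tokens, head_dim)  # 196 64
```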

License: MIT

