
Getting Started

This guide walks you through the full workflow: building your models, preparing training artifacts, and deploying everything to an Android device.


Prerequisites

  • Python 3.9+; 16–32 GB of RAM recommended, since model conversion is memory-intensive (actual requirements depend on the model)
  • Android Studio installed
  • An Android device

Install the required Python dependencies before anything else:

pip install -r requirements-ort.txt

Part 1 — Setting Up the Android Libraries

The Android project requires pre-built ONNX Runtime native libraries to compile.

1. Download the Libraries

Download the library archive from the provided link. The archive contains:

File                  Purpose
jniLibs.zip           ONNX Runtime native libraries for Android
Additional archives   ONNX Runtime Training builds with NNAPI / XNNPack support (optional)

Building for a different target

The provided jniLibs are built for arm64-v8a and x86_64, which cover most physical Android devices and the default x86_64 emulator images. If you need a different ABI or a custom ORT build, refer to the official ONNX Runtime build-from-source guide.

2. Extract and Place the Libraries

Extract jniLibs.zip and place the resulting folder at:

android/ORTransformers/ORTransformersMobile/src/main/jniLibs/

The final layout should look like this:

jniLibs/
├── arm64-v8a/
│   └── *.so
└── x86_64/
    └── *.so
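To confirm the extraction worked, a minimal stdlib-only sketch like the one below can check that each expected ABI directory contains at least one shared library. The path and ABI list mirror the layout shown above; `check_jnilibs` is a hypothetical helper, not part of the project, so adjust it if you built for other targets.

```python
from pathlib import Path

# Root of the extracted native libraries (matches the layout shown above).
JNILIBS = Path("android/ORTransformers/ORTransformersMobile/src/main/jniLibs")
EXPECTED_ABIS = ["arm64-v8a", "x86_64"]

def check_jnilibs(root=JNILIBS):
    """Return a list of problems; an empty list means the layout looks correct."""
    problems = []
    for abi in EXPECTED_ABIS:
        abi_dir = Path(root) / abi
        if not abi_dir.is_dir():
            problems.append(f"missing ABI directory: {abi_dir}")        # folder absent
        elif not any(abi_dir.glob("*.so")):
            problems.append(f"no .so files in {abi_dir}")               # folder empty
    return problems

if __name__ == "__main__":
    for problem in check_jnilibs():
        print("WARNING:", problem)
```

If the script prints nothing, the Android project should find the native libraries at build time.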

Part 2 — Building the Training Model

All model-related configuration lives in config.yml. Open it and set your paths and options before running any commands.

Step 1 — Convert to ONNX

Convert a model from a local path or Hugging Face Hub into ONNX format:

python -m trainer.builder --config config.yml

Output is written to build/train_models/ in ONNX format.

Step 2 — Generate Training Artifacts

Convert the ONNX model into on-device training artifacts:

python -m artifact.onnx_builder --config config.yml

Output is written to build/train/. This directory contains everything the Android application needs to run on-device fine-tuning.
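As a sanity check before deploying, a short stdlib sketch can confirm the expected files landed in build/train/. The file names below are the typical outputs of ONNX Runtime's on-device training artifact generation (training, eval, and optimizer graphs plus a checkpoint); this is an assumption about the build output, and your config.yml may produce a different set.

```python
from pathlib import Path

# Typical on-device training artifacts produced by ONNX Runtime's artifact
# generation step; the exact set may vary with your config.yml.
EXPECTED = ["training_model.onnx", "eval_model.onnx",
            "optimizer_model.onnx", "checkpoint"]

def missing_artifacts(train_dir="build/train", expected=EXPECTED):
    """Return the names from `expected` that are absent from train_dir."""
    root = Path(train_dir)
    return [name for name in expected if not (root / name).exists()]

if __name__ == "__main__":
    for name in missing_artifacts():
        print("WARNING: missing artifact:", name)
```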


Part 3 — Building the Inference Model

Step 1 — Convert to ONNX

Convert the model into ONNX format for inference:

python -m inference.builder --config config.yml

Step 2 — Generate Inference Artifacts

Generate the on-device inference artifacts from the ONNX model:

python -m artifact.onnx_builder --config config.yml

The inference artifacts are also written under build/.


Part 4 — Deploying to an Android Device

Once you have your model artifacts, push them onto the device using Android Studio.

1. Build the Application

Open the project in Android Studio and build it with the native libraries in place (see Part 1).

2. Push Model Files via Device Explorer

In Android Studio, open View → Tool Windows → Device Explorer and navigate to the application's private storage directory:

/data/data/com.martinkorelic.ortmobile/files/

3. Create Your Model Repository Folder

Create a folder with a name of your choice (this becomes your repository name) and copy your model artifacts into it. The expected structure is:

repositoryName/
├── train/          # Training artifacts (from build/train/)
├── inference/      # Inference artifacts
├── tokenizer/      # Tokenizer files
├── embedding/      # (Optional) Embedding model for on-device RAG
└── database/       # ObjectBox local vector database

Only copy the folders you actually need — for example, if you are running inference only, you can omit train/.
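Before pushing the folder to the device, a quick local check can catch an empty or mis-assembled repository. The sketch below encodes the rule stated above (at least one of train/ or inference/ must be present, the rest are optional); `validate_repository` is a hypothetical helper for illustration, not part of the framework.

```python
from pathlib import Path

# Model repository subfolders, taken from the layout described above.
CORE = ["train", "inference"]                      # at least one is needed
OPTIONAL = ["tokenizer", "embedding", "database"]  # copy only if actually used

def validate_repository(repo):
    """Return problems with the repository layout; empty means it looks usable."""
    repo = Path(repo)
    problems = []
    if not any((repo / d).is_dir() for d in CORE):
        problems.append("expected at least one of: train/, inference/")
    return problems

if __name__ == "__main__":
    for problem in validate_repository("myRepository"):
        print("WARNING:", problem)
```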


What's Next?

  • Read about supported On-device PEFT Methods such as LoRA and MARS.
  • Check config.yml comments for a full description of every configuration option.

Stability notice

The framework is under active development. Most core features work reliably, though some advanced scenarios — particularly around custom PEFT configurations and certain model architectures — may occasionally require minor adjustments.