On-device text generation app using GPT-2 or DistilGPT2 (same distillation process as DistilBERT; 2x faster and 33% smaller than GPT-2).
Check the associated *On-Device Machine Learning: Text Generation on Android* article for more information on the how-to!
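As background, text generation apps like this one typically pick each next token by sampling from the model's output logits, commonly with top-k sampling. The helper below is an illustrative sketch, not the app's actual decoding code, and the `k=40` default is an assumption:

```python
import math
import random

def top_k_sample(logits, k=40, rng=random):
    """Sample one token index from the k highest-scoring logits."""
    # indices of the k largest logits
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # softmax over just those k logits (subtract the max for numerical stability)
    m = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    # draw one index proportionally to its renormalized probability
    r = rng.random() * total
    acc = 0.0
    for idx, e in zip(top, exps):
        acc += e
        if r <= acc:
            return idx
    return top[-1]

# With k=2, only the two highest-scoring tokens (indices 1 and 3) can be drawn
token = top_k_sample([0.1, 5.0, 0.2, 4.0], k=2)
```

Generation then appends the sampled token to the input and repeats, which is why on-device latency per token matters so much for the model sizes listed below.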
Available models:
- "original" converted small GPT-2 (472MB)
- FP16 post-training-quantized small GPT-2 (237MB)
- 8-bit precision weights post-training-quantized small GPT-2 (119MB)
- "original" converted DistilGPT2, a distilled version of GPT-2 (310MB)
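The size differences track the weight precision: FP16 stores each weight in 2 bytes instead of float32's 4, and 8-bit quantization uses 1 byte, so the quantized files come out at roughly one half and one quarter of the original. A quick check of the ratios (sizes in MB taken from the list above):

```python
# file sizes in MB, as listed above for the small GPT-2 variants
sizes_mb = {"float32": 472, "fp16": 237, "int8": 119}

for name, mb in sizes_mb.items():
    ratio = mb / sizes_mb["float32"]
    print(f"{name}: {mb} MB ({ratio:.2f}x the float32 file)")
```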
- If you don't have it already, install Android Studio, following the instructions on the website.
- Android Studio 3.2 or later.
- You need an Android device or Android emulator, and an Android development environment, with minimum API 26.
- Open Android Studio, and from the Welcome screen, select `Open an existing Android Studio project`.
- From the `Open File or Project` window that appears, select the directory where you cloned this repo.
- You may also need to install various platforms and tools according to error messages.
- If it asks you to use Instant Run, click `Proceed Without Instant Run`.
- You need to have an Android device plugged in with developer options enabled at this point. See the Android documentation for more details on setting up developer devices.
- If you already have an Android emulator installed in Android Studio, select a virtual device with minimum API 26.
- Be sure the `gpt2` configuration is selected.
- Click `Run` to run the demo app on your Android device.
From the repository root location:
- Use the following command to build a demo apk:

```shell
./gradlew :gpt2:build
```

- Use the following command to install the apk onto your connected device:

```shell
adb install gpt2/build/outputs/apk/debug/gpt2-debug.apk
```
To choose which model to use in the app:
- Remove/rename the current `model.tflite` file in `src/main/assets`
- Comment/uncomment the model to download in the `download.gradle` config file:

```groovy
"https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-64-fp16.tflite": "model.tflite", // <- fp16 quantized gpt-2 (small) (default)
// "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-64-8bits.tflite": "model.tflite", // <- 8-bit integers quantized gpt-2 (small)
// "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-64.tflite": "model.tflite", // <- "original" gpt-2 (small)
// "https://s3.amazonaws.com/models.huggingface.co/bert/distilgpt2-64.tflite": "model.tflite", // <- distilled version of gpt-2 (small)
```
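After swapping files, a quick way to sanity-check that what ended up in `src/main/assets` is actually a TFLite model: TFLite files are FlatBuffers carrying the file identifier `TFL3` at byte offset 4. The helper name below is ours, not part of the app:

```python
def looks_like_tflite(path):
    """Return True if the file carries the TFLite FlatBuffer identifier."""
    with open(path, "rb") as f:
        header = f.read(8)
    # a FlatBuffer's 4-byte file identifier sits right after the 4-byte root offset
    return header[4:8] == b"TFL3"
```

Calling `looks_like_tflite("gpt2/src/main/assets/model.tflite")` after the Gradle download completes should return `True`; a truncated or HTML error-page download will fail this check.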