
General Questions

What is GladeCore?

GladeCore is an on-device AI system for games. It lets players hold dynamic, unscripted conversations with NPCs via text or voice. NPC personalities, knowledge, and voices are defined with editor-friendly Data Assets - no external services required. Because inference runs locally on the player’s device, you get near-zero latency and zero per-interaction costs.

Core capabilities:
  • Local LLM dialogue (text→text)
  • Optional speech I/O (STT for mic input, TTS for NPC voice)
  • Data-driven personalities & world lore
  • Works fully offline; no cloud round-trips
  • Designed and battle-tested by game devs, for game devs
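To make “editor-friendly Data Assets” concrete, here is a minimal sketch of what an NPC personality asset could look like in Unreal C++. The class and property names are illustrative assumptions, not GladeCore’s actual API:

```cpp
// Illustrative sketch only - GladeCore's real asset classes and fields may differ.
#pragma once

#include "CoreMinimal.h"
#include "Engine/DataAsset.h"
#include "ExampleNPCPersonality.generated.h" // UHT-generated; file name must match this header

UCLASS(BlueprintType)
class UExampleNPCPersonality : public UDataAsset
{
    GENERATED_BODY()

public:
    // Name shown in dialogue UI.
    UPROPERTY(EditAnywhere, BlueprintReadOnly, Category = "NPC")
    FText DisplayName;

    // System-prompt-style description of personality and speaking style.
    UPROPERTY(EditAnywhere, BlueprintReadOnly, Category = "NPC", meta = (MultiLine = true))
    FString PersonaPrompt;

    // Facts this NPC knows; shared world lore can live in a separate lore asset.
    UPROPERTY(EditAnywhere, BlueprintReadOnly, Category = "NPC")
    TArray<FString> KnowledgeFacts;

    // Optional voice identifier for TTS (e.g., a local Piper voice or an ElevenLabs voice ID).
    UPROPERTY(EditAnywhere, BlueprintReadOnly, Category = "Voice")
    FString VoiceId;
};
```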

How is GladeCore different from existing tools like InWorld and ConvAI?

GladeCore is not a traditional cloud platform - it is built to ship directly inside your game. This means: 
  • No per-token costs: Inference is local, so you’re not metered per call.
  • Low latency: No network hops. Interactions feel snappy and immersive.
  • Privacy by default: Player data never has to leave the device, which simplifies privacy compliance.
  • Editor-native workflow: Personalities, lore, and voices are Data Assets you ship; no web dashboards required.
  • Modular: Use only what you need (text LLM only, or add STT/TTS).
  • Own your experience: You control prompts, safety rails, memory, and content policy.

What LLM does GladeCore use? Can I customize / switch the model?

By default, GladeCore ships with a Llama 3.2 ~1B backbone fine-tuned (via LoRA) on purpose-built emotional/gameplay data. This provides strong conversational quality while staying small enough to run on consumer hardware. And yes, you can swap models. In Unreal Engine, for example:
  • Provide a .gguf model file and a chat template.
  • In Unreal, set your template via CustomChatTemplate (e.g., to use Qwen, Phi, etc.).
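For example, a sketch of switching to a Qwen-style model might look like the following; apart from CustomChatTemplate (named above), the settings type, field names, file path, and template placeholders are illustrative assumptions:

```cpp
// Illustrative sketch only - the settings type and field names are assumptions;
// only CustomChatTemplate is named in the steps above.
#include "CoreMinimal.h"
#include "Misc/Paths.h"

struct FExampleModelSettings          // hypothetical container for model settings
{
    FString ModelFilePath;            // path to the .gguf file you ship
    FString CustomChatTemplate;       // chat template matching that model family
};

FExampleModelSettings MakeQwenSettings()
{
    FExampleModelSettings Settings;

    // Package the model file with your project content (path is an example).
    Settings.ModelFilePath = FPaths::Combine(
        FPaths::ProjectContentDir(), TEXT("Models/qwen-1_5b-instruct-q4.gguf"));

    // ChatML-style template used by Qwen instruct models; Phi, Llama, etc. use different templates.
    // The {system} / {prompt} placeholder names are illustrative - use whatever format the plugin documents.
    Settings.CustomChatTemplate =
        TEXT("<|im_start|>system\n{system}<|im_end|>\n")
        TEXT("<|im_start|>user\n{prompt}<|im_end|>\n")
        TEXT("<|im_start|>assistant\n");

    return Settings;
}
```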
Licensing note: If you redistribute Meta Llama models, include the required license/notice files and attributions (e.g., “Built with Llama”) per their terms. If you prefer to avoid those obligations, use another supported model and template.

How much conversation context can AI NPCs keep?

Typical configs run 2,048 to 4,096 tokens of context, which fits ~30 to 60 short turns depending on your system prompt and style. You can raise or lower context to trade quality vs. memory/latency. We also provide world-lore Data Assets for shared knowledge that doesn’t need to sit in the rolling chat buffer.
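As a rough sanity check on those numbers (the system-prompt and per-turn token counts below are assumptions, not measured figures):

```cpp
// Rough budget arithmetic, illustrative only.
const int ContextTokens      = 4096;  // configurable; larger = more memory and latency
const int SystemPromptTokens = 400;   // depends on your persona/lore prompt
const int TokensPerShortTurn = 60;    // assumed average for short conversational turns

// ~61 turns at 4,096 tokens; the same math at 2,048 tokens gives ~27,
// roughly in line with the ~30 to 60 turns quoted above.
const int ApproxTurns = (ContextTokens - SystemPromptTokens) / TokensPerShortTurn;
```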

Will adding more NPCs slow things down?

Not in the usual case. NPCs are data assets (compact .uassets). At runtime, a single local model instance swaps between NPCs as the player engages them. Load times remain negligible even with hundreds of NPCs. If you want simultaneous multi-NPC conversations, that implies multiple concurrent inference contexts (and potentially multiple model instances), which is possible but not recommended for most targets due to the performance cost.
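As a schematic of why this stays cheap (all type and function names below are invented for illustration, not GladeCore’s API): the heavy model lives in one shared instance, and switching NPCs only replaces compact persona data.

```cpp
// Schematic only - all names here are invented for illustration.
#include "CoreMinimal.h"

struct FExampleNPCData                 // stand-in for a compact NPC Data Asset
{
    FString PersonaPrompt;
    TArray<FString> KnowledgeFacts;
};

class FExampleLocalModel               // stand-in for the single shared model instance
{
public:
    void ResetContext() { History.Empty(); }
    void SetSystemPrompt(const FString& InPrompt) { SystemPrompt = InPrompt; }
private:
    FString SystemPrompt;
    TArray<FString> History;
};

// Switching NPCs swaps lightweight data; the model weights stay loaded.
void BeginConversation(FExampleLocalModel& SharedModel, const FExampleNPCData& NPC)
{
    SharedModel.ResetContext();                       // drop the previous rolling chat buffer
    SharedModel.SetSystemPrompt(NPC.PersonaPrompt);   // load the new NPC's persona
}
```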

Setup & Compatibility

What engines, platforms, and UE versions are supported?

  • Unreal Engine: versions 5.2 to 5.6 (Windows as both development and target platform today).
  • Unity: in progress; we’re aligning feature parity with our UE SDK.
  • Targets: PC is first-class. Mobile and console are achievable with smaller models, quantization, and context tuning; this work is in progress alongside our Unity development.

How does GladeCore impact game performance?

GladeCore is built to keep frame rates stable while eliminating API costs. Our default build runs a ~1B-parameter model in under ~2 GB of RAM using mixed 4-bit/8-bit quantization. That is well within memory budgets on PC and current consoles, and under what large mobile titles reserve on modern phones. We also split the work by offloading model layers across the GPU and CPU, and inference runs on low-priority/async queues so the renderer isn’t starved. Latency is the other key issue with frontier models and cloud infrastructure: even fast cloud endpoints can add hundreds of milliseconds at busy times. How much that matters depends on the use case, but for real-time dialogue it is especially important.
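As an illustration of the async pattern (RunLocalInference is a hypothetical stand-in for the plugin’s generate call, not a real GladeCore function):

```cpp
// Sketch of keeping inference off the game thread so the renderer isn't starved.
#include "CoreMinimal.h"
#include "Async/Async.h"

FString RunLocalInference(const FString& PlayerLine); // hypothetical blocking generate call

void RequestNPCReply(const FString& PlayerLine, TFunction<void(const FString&)> OnReply)
{
    Async(EAsyncExecution::ThreadPool, [PlayerLine, OnReply]()
    {
        // Heavy, blocking work runs on a pool thread.
        const FString Reply = RunLocalInference(PlayerLine);

        // Marshal the result back to the game thread before touching UI or actors.
        AsyncTask(ENamedThreads::GameThread, [Reply, OnReply]()
        {
            OnReply(Reply);
        });
    });
}
```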

Does GladeCore work offline?

Yes. Everything (LLM + optional STT/TTS) can run fully offline. You can also mix: e.g., local LLM + cloud TTS (ElevenLabs) or vice versa.

What speech options are available?

  • STT (Speech-to-Text): multilingual, on-device model by default.
  • TTS (Text-to-Speech): local Piper voices or the ElevenLabs API.
You can switch per-NPC or per-platform.
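As a sketch of how per-NPC voice selection might be expressed as data (the enum and struct below are illustrative assumptions, not GladeCore types):

```cpp
// Illustrative only - not GladeCore's actual types.
#pragma once

#include "CoreMinimal.h"
#include "ExampleVoiceSettings.generated.h" // UHT-generated; file name must match this header

UENUM(BlueprintType)
enum class EExampleVoiceBackend : uint8
{
    LocalPiper,       // fully offline, ships with the game
    ElevenLabsAPI     // cloud TTS; needs connectivity and an API key
};

USTRUCT(BlueprintType)
struct FExampleVoiceSettings
{
    GENERATED_BODY()

    // Default to the offline backend; override per NPC or per platform.
    UPROPERTY(EditAnywhere)
    EExampleVoiceBackend Backend = EExampleVoiceBackend::LocalPiper;

    // Piper voice name or ElevenLabs voice ID, depending on Backend.
    UPROPERTY(EditAnywhere)
    FString VoiceId;
};
```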

Common Bugs (Unreal Engine)

Plugin GladeCore failed to load because module GladeCore could not be loaded?

If you followed all the steps in our installation documentation (https://docs.gladecore.com/unreal/installation), this error should not appear. If it does, check the following:
  • Make sure the project you are integrating GladeCore into is a C++ project (if it is currently Blueprint-only, see the note after this list).
  • Make sure to copy the downloaded plugin into your project directory’s Plugins folder (if there is no Plugins folder yet, create one). Then rename the copied folder to GladeCore, so your folder structure is, for example: MyProject/Plugins/GladeCore. After you do this, also delete your project’s Binaries and Intermediate folders and rebuild.
  • Make sure your default source code editor is set to an IDE supported by Unreal Engine (such as Visual Studio or JetBrains Rider), not Visual Studio Code.
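Note: if your project is currently Blueprint-only, adding any C++ class (Tools > New C++ Class in the UE5 editor) and letting Unreal regenerate project files turns it into a C++ project. A minimal stub is enough; the class name below is arbitrary:

```cpp
// ExampleStub.h - a do-nothing class whose presence makes the project a C++ project.
// (Classes generated by the wizard look similar.)
#pragma once

#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "ExampleStub.generated.h"

UCLASS()
class AExampleStub : public AActor
{
    GENERATED_BODY()
};
```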
On newer UE versions, disable NNE plugins when prompted. MetaHuman Creator/editor plugins can sometimes conflict - disable any runtime creator/editor plugins you don’t need in the project.