AI Coding Assistants Integration
Introduction
The CERIT-SC AI infrastructure exposes Large Language Models (LLMs) through standard API protocols, allowing you to integrate powerful AI assistance directly into your local development environment. By connecting your tools to our backend, you can leverage high-performance models (such as qwen3-coder or gpt-oss-120b) for coding tasks without needing to run them on your own hardware or rely on external commercial providers.
This chapter describes how to configure two popular tools to communicate with our API.
Prerequisite: Before proceeding, ensure you have generated an API key from the AI Chat WebUI. You will need this key to authenticate your client.
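To verify that your key works, you can send a quick request to the chat-completions endpoint. This is a minimal sketch assuming the standard OpenAI-compatible route under the same `/v1` base URL and `qwen3-coder` model name used in the Continue configuration later in this chapter:

```bash
# Minimal connectivity check against the OpenAI-compatible API.
# Replace <api-key> with the key generated in the AI Chat WebUI.
curl https://llm.ai.e-infra.cz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -d '{
        "model": "qwen3-coder",
        "messages": [{"role": "user", "content": "Say hello"}]
      }'
```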
Integrate with Visual Studio Code
The AI chatbot can be integrated with Visual Studio Code through the third-party extension Continue. With this extension, you can engage AI in several roles that help you while coding in Visual Studio Code: simple chat, agent mode, and autocomplete. While the chat provides a familiar conversation, the agent mode lets you analyze or edit files opened in your project. Finally, the autocomplete role suggests code as you write.
Install the Continue extension
While Visual Studio Code is running, open the Extensions tab (Ctrl+Shift+X) and search for Continue. Click on the extension and then click on the Install button. After the installation is complete, you can access the Continue extension by clicking on the Continue icon in the left sidebar.
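Alternatively, the extension can be installed from the command line with the `code` CLI. This assumes the extension ID `Continue.continue`; you can confirm the exact ID on the extension's Marketplace page:

```bash
# Install the Continue extension without opening the Extensions tab
code --install-extension Continue.continue
```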
Configure the Continue extension
First, the configuration file `config.yaml` must be edited. Access the Continue extension within Visual Studio Code using the icon in the left sidebar.

- Click `Open settings` in the top-right corner of the `Continue` extension window.
- Click `Configs`.
- Click the `Open configuration` icon at the end of the `Local Config` line.
- Once `config.yaml` is opened, use the following configuration with your own `<api-key>` (see the guide above):
```yaml
%YAML 1.1
---
name: Local Assistant
version: 1.0.0
schema: v1
model_defaults: &model_defaults
  provider: openai
  apiKey: <api-key>
  apiBase: https://llm.ai.e-infra.cz/v1
models:
  - name: autocomplete-coder
    <<: *model_defaults
    model: qwen3-coder
    promptTemplates:
      autocomplete: '<|fim_prefix|>{{{ prefix }}}<|fim_suffix|>{{{ suffix }}}<|fim_middle|>'
    autocompleteOptions:
      transform: false
    defaultCompletionOptions:
      temperature: 0.6
      maxTokens: 512
    roles:
      - autocomplete
  - name: chat-coder
    <<: *model_defaults
    model: qwen3-coder
    env:
      useLegacyCompletionsEndpoint: false
    roles:
      - chat
      - edit
context:
  - provider: code
  - provider: docs
  - provider: diff
  - provider: terminal
  - provider: problems
  - provider: folder
  - provider: codebase
```

- Save the file. The new configuration should apply immediately.
- Check that FIM autocomplete is used by looking at the `Continue` button in the Visual Studio Code status bar (in the bottom-right corner). It should display `Continue`, not `Continue (NE)`. If `Continue (NE)` is shown, press this button and select `Use FIM autocomplete over Next Edit`.
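For context, the `promptTemplates.autocomplete` entry in the configuration above uses the model's fill-in-the-middle (FIM) special tokens: Continue substitutes the code before the cursor for `{{{ prefix }}}` and the code after it for `{{{ suffix }}}`, and the model generates the missing middle. As a hypothetical illustration, with the cursor inside a small Python function, the rendered prompt would look like:

```text
<|fim_prefix|>def add(a, b):
    result = <|fim_suffix|>
    return result<|fim_middle|>
```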
Usage of AI in Visual Studio Code
- The chat can be accessed by pressing the `Continue` icon in the left sidebar.
- In the chat, you can engage the agent mode by asking it to analyze, explain, or edit a file in the currently open project. This requires additional permissions, such as read/write access to the related files; you need to grant them for the agent to do its job.
- The autocomplete feature continuously suggests new code as you write. Once you see a suggestion, press `Tab` to accept it. The responsiveness of the suggestions depends on the model's speed. You can change the model `qwen3-coder` in the autocomplete config section to `gpt-oss-120b` for slightly faster responses, but the default model performs better in coding tasks.
Caveats
Disable all other Visual Studio Code extensions that provide AI autocomplete features. Otherwise, the Continue extension may not work properly.
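For example, if GitHub Copilot is installed, its completions can be switched off in your `settings.json`. This is a sketch assuming Copilot; adapt the setting for whichever assistant you use:

```jsonc
{
  // Turn off Copilot completions for all languages so they do not
  // clash with Continue's FIM autocomplete (hypothetical example).
  "github.copilot.enable": { "*": false }
}
```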
Claude Code
You can deploy Claude Code and configure it to work with our models by pointing it to our API endpoint.
Installation
First, install Claude Code for your operating system by following the official instructions in the upstream repository.
Make sure the `claude` CLI is available in your `$PATH` after installation.
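A quick way to confirm the installation succeeded:

```bash
# Check that the claude binary is on $PATH and print its version
command -v claude && claude --version
```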
Configuration
Claude Code is configured using environment variables. Export the following variables in your shell:
```bash
export ANTHROPIC_BASE_URL="https://llm.ai.e-infra.cz/"
export ANTHROPIC_AUTH_TOKEN="sk-..."
export ANTHROPIC_MODEL="qwen3-coder"
export ANTHROPIC_DEFAULT_OPUS_MODEL="qwen3-coder"
export ANTHROPIC_DEFAULT_SONNET_MODEL="qwen3-coder"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="gpt-oss-120b"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
```

Alternatively, you can define these environment variables in the settings file `~/.claude/settings.json`:
```json
{
  "permissions": {
    "defaultMode": "acceptEdits"
  },
  "env": {
    "ANTHROPIC_BASE_URL": "https://llm.ai.e-infra.cz/",
    "ANTHROPIC_AUTH_TOKEN": "sk-...",
    "ANTHROPIC_MODEL": "qwen3-coder",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "qwen3-coder",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "qwen3-coder",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "gpt-oss-120b",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
  }
}
```

Variable description:
- `ANTHROPIC_BASE_URL` – Base URL of our LLM API.
- `ANTHROPIC_AUTH_TOKEN` – Your API token obtained from https://chat.ai.e-infra.cz.
- `ANTHROPIC_MODEL` – Default model to use when running Claude Code.
- `ANTHROPIC_DEFAULT_OPUS_MODEL` – Default model for reasoning and complex tasks.
- `ANTHROPIC_DEFAULT_SONNET_MODEL` – Default model for reasoning and less complex tasks.
- `ANTHROPIC_DEFAULT_HAIKU_MODEL` – Default model for simple tasks.
- `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC` – Disables telemetry and various reporting (not useful with non-Anthropic APIs).
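Before launching, you can verify that the variables are actually set in your current shell with a simple sanity check:

```bash
# List the Claude Code related variables visible to child processes
env | grep -E '^(ANTHROPIC|CLAUDE_CODE)'
```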
Running Claude Code
Once the environment variables are set, start Claude Code with:
```bash
claude --model qwen3-coder
```

You should now be able to interact with Claude Code using our backend and selected model.
You can choose any of our available models, e.g., `devstral-2`. However, not all models are guaranteed to work correctly with Claude Code; in particular, DeepSeek-R1 currently does not work and returns the error `Internal server error: can only concatenate str (not "dict") to str`.
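As a quick smoke test, you can also run Claude Code non-interactively; the `-p`/`--print` flag sends a single prompt and prints the response:

```bash
# One-shot prompt: prints the model's answer and exits
claude --model qwen3-coder -p "Briefly describe this repository"
```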