New on LowEndTalk? Please Register and read our Community Rules.
All new Registrations are manually reviewed and approved, so a short delay after registration may occur before your account becomes active.
All new Registrations are manually reviewed and approved, so a short delay after registration may occur before your account becomes active.
Wanted: LLM Inference service providers in Europe
I'm looking for offers and pointers for LLM inference service via API, in particular from European API providers that adhere to GDPR. The LLMs offered via API should currently include Llama-3.1-70B-instruct FP16.
PS: For LLM providers in general, https://artificialanalysis.ai/ is a good resource.
PPS: I am aware of IONOS, Mistral, OVHCloud, Nebius, LightOn / Orange Business, T-Systems
Keywords: Llama, AI, GenAI, generative AI, vLLM, openAI API, tokens


Comments
TensorDock / @lentro
They seem to offer GPUs but no inference service - yet
IBM cloud
SGLang works great for most of the models, including Llama.
https://github.com/sgl-project/sglang
Out of curiosity, but why BF16?
Many benchmarks show that there are no differences between BF16 and FP8.
I know. They were offering free API access for Llama 3.1 8B hosted in their infrastructure, so maybe a custom plan? That’s the closest I’ve seen in this forum regarding your requirements, so it never hurts to ask.
Why not just use something like groqcloud?
Runpod.io but you have to develop to add models yourself although some models are pretty much out of the box already in their pod instances templates.
Given the current uncertainty about EU laws around AI I doubt there are many companies in this space.
https://cortecs.ai/