Austria’s Ora Computing secures €3.5 million to make AI models smaller and faster

Vienna-based Ora Computing, a startup specialising in optimising and compressing AI foundation models, today announced the close of a €3.5 million Seed round. The funding will be used to grow the team, extend compression capabilities to the largest frontier models, and launch a commercial product for cloud inference providers and companies deploying AI.

  • David Cendon Garcia
  • June 24, 2026

The round was led by Constructor Capital and Greencode Ventures, with continued backing from foundational investor XISTA Science Ventures, who helped build and launch the company.

“We founded Ora Computing to challenge the assumption that massive scale is needed to reach useful intelligence. We believe that the next wave of AI adoption will be driven by more compact models that are highly efficient and optimised for specific use cases rather than large general-purpose cloud models. Ora is building the software and algorithm stack that enables this transition,” says Stefan Sack, CEO and co-founder of Ora Computing.

Ora Computing’s Seed round sits at the smaller end of the 2026 funding activity around Europe’s AI infrastructure and deployment stack.

Larger rounds have gone into compute capacity and data-centre infrastructure, including Mistral AI, Nscale and Verda, while smaller Seed and pre-Seed rounds have targeted enabling software layers such as AI memory, agent governance, licensed data access and compression.

Ora’s focus on reducing model size and inference cost is therefore aligned with a broader 2026 pattern: capital is moving not only into building more AI compute, but also into technologies intended to make AI systems cheaper, more deployable and more efficient to run.

“AI’s energy appetite is growing faster than the world can build the infrastructure to feed it. One key approach is to make AI itself more efficient, and that is exactly what Ora does. Compressing models radically without sacrificing accuracy makes a tremendous difference to their customers,” says Terhi Vapola, Founder and Managing Partner of Greencode Ventures.

Founded in 2024, Ora Computing is building an AI model compression and optimisation stack that reduces the memory footprint of large AI models by up to 80% and makes them run up to four times faster.

Ora was founded by Stefan Sack and Raimel Medina, both quantum computing researchers from the Serbyn group at the Institute of Science and Technology Austria (ISTA).

By making models dramatically smaller with minimal loss in accuracy, Ora looks to enable its customers to deploy AI locally on energy-efficient edge hardware rather than energy-hungry cloud infrastructure. For cloud deployments, smaller models translate directly into lower serving costs and higher throughput.

According to the company, AI inference – the process of actually running an AI model to generate outputs – has become a significant and fast-growing cost for any company deploying AI at scale. Major deployments can now cost tens of millions of euros per month in compute alone, and the problem compounds as models continue to grow in size.

For companies wanting to run AI locally on devices like cars or industrial equipment, the models are often simply too large to fit.

Because compressed models require significantly less compute power to run, the efficiency gains also translate directly into lower energy consumption and reduced carbon emissions: at 1% market penetration, Ora estimates its technology could eliminate more than 50,000 tonnes of CO2 annually.

Unlike existing compression tools, Ora says its approach works across different hardware types and drops directly into standard inference frameworks – no custom software layers, no capital-intensive retraining, no changes to existing infrastructure.

Where competing approaches force a binary choice between compression levels, Ora’s algorithm continuously maps the full tradeoff between model size and accuracy, letting companies optimise for their specific hardware and cost constraints.
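To illustrate what choosing a point on such a size/accuracy tradeoff could look like in practice, here is a minimal Python sketch. It is not Ora's algorithm, which the company has not published in this article. The `Variant` type, the `best_variant_within_budget` helper and every number on the curve are invented purely for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Variant(NamedTuple):
    """One point on a hypothetical size/accuracy tradeoff curve."""
    size_gb: float   # memory needed for the compressed weights
    accuracy: float  # benchmark score between 0 and 1


def best_variant_within_budget(
    variants: Sequence[Variant], budget_gb: float
) -> Optional[Variant]:
    """Return the most accurate variant that fits the memory budget, or None."""
    fitting = [v for v in variants if v.size_gb <= budget_gb]
    return max(fitting, key=lambda v: v.accuracy) if fitting else None


# Made-up tradeoff curve: smaller variants lose a little accuracy.
curve = [
    Variant(140.0, 0.80),
    Variant(70.0, 0.79),
    Variant(35.0, 0.77),
    Variant(28.0, 0.74),
]

# Hypothetical deployment target with 48 GB of accelerator memory.
choice = best_variant_within_budget(curve, budget_gb=48.0)
print(choice)
```

With a finer-grained curve, the same selection step would let a deployment sit closer to its hardware limit instead of settling for one of a few fixed compression levels.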

Ora has tested this with a 70-billion-parameter model compressed in hours at a compute cost of under $1,000, compared to industry figures of hundreds of thousands of dollars for comparable work.
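For a sense of scale, the back-of-the-envelope calculation below estimates the weight memory of a 70-billion-parameter model before and after an 80% reduction, the upper bound Ora cites. It assumes the uncompressed weights are stored at 16 bits per parameter, which the article does not state, and it ignores activations and other runtime memory. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 70e9        # 70 billion parameters, as in the test Ora describes
baseline_bits = 16   # assumption: 16-bit weights before compression
reduction = 0.80     # "up to 80%" memory footprint reduction, per Ora

baseline_gb = weight_memory_gb(params, baseline_bits)
compressed_gb = baseline_gb * (1 - reduction)

print(f"Uncompressed weights: {baseline_gb:.0f} GB")
print(f"After 80% reduction:  {compressed_gb:.0f} GB")
```

Under these assumptions the weights shrink from roughly 140 GB to roughly 28 GB, which is the difference between needing several data-centre accelerators and fitting on a single high-memory device.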
