EUSKORPUS

The great digital library of the Basque language.

We are developing the digital library that machines need to understand and speak Basque. We generate massive corpora and open-source models to keep Basque functional and competitive.

DESCRIPTION

Digital corpus.

A digital corpus is like an infinite library, but for training artificial intelligence. It includes everything from everyday conversations to specialized texts.

The project contributes to the preservation and maintenance of the Basque language in digital environments.

Why is it vital?

Because without data, there is no AI. And without AI, Basque is left off the digital map. Euskorpus is the foundation that will enable the development of voice assistants, machine translators, chatbots, and many other applications in Basque. It also promotes a positive impact on both industry and society, and aligns with the European framework for digital linguistic resources.

THE 3 PHASES

A clear plan.
A guaranteed impact.

Phase 1.
Generation.

We collect and label rich and diverse content.

Phase 2.
Training.

We develop open-source AI models.

Phase 3.
Transfer.

We put them at the service of industry and society.

FILE NUMBER: (N. 2025/00744)(A/20250244)
TITLE: Direct grant agreement from the Euskorpora association for the implementation of initiatives in the field of Basque language technologies, within the framework of the Euskorpus project.
AMOUNT: €10,550,000
DURATION: 2025 – 2027

Mikeletegi Pasealekua 65
20009 Donostia / San Sebastián
Gipuzkoa - SPAIN

+34 611 02 81 72
info@euskorpora.eus