Iscte, through the Information Sciences, Technologies and Architecture Research Centre (ISTAR-Iscte), has recently secured the ORAL – kriOl(u) laRge lAnguage modeLs project, a consortium initiative with the University of Cape Verde (UniCV). The project is co-funded by the Organisation of Ibero-American States (OEI) under its 2025 Support Fund – OEI-Portugal, which is aimed at developing platforms and technological resources geared towards multilingualism. It will run for 15 months, starting on 15 October, with a total budget of USD 72,000.
The project seeks to ensure that speakers of Cape Verdean Creole (ISO 639-3: kea) – a Portuguese-based creole language that emerged in the late 15th century and is spoken both in Cape Verde and in the diaspora – can benefit from the advances of digital transformation in their mother tongue.
The initiative also aims to support public policies that promote the recognition, standardisation and officialisation of Cape Verdean Creole, in close cooperation with Cape Verdean and Portuguese institutions responsible for linguistic development and for the integration of the language into the international digital transformation landscape. Among these are the Ministry of Education, the Ministry of Culture and Creative Industries, and Camões – Institute for Cooperation and Language, as well as civil society organisations such as the Cape Verdean Mother Language Association (ALMA-CV).
The project will make available, on an open-access and open-source basis, the first linguistic resources and natural language processing tools for Cape Verdean Creole, none of which have existed until now. These include:
- Creole text corpora (Santiago and São Vicente varieties);
- Parallel Creole ↔ European Portuguese corpora;
- The first large language model (LLM) for generating written text and dialogue in Creole;
- The first LLM for bidirectional Creole ↔ European Portuguese translation;
- A Creole phonetic glossary and demonstration web application;
- Creole speech corpora (Santiago and São Vicente varieties);
- The first speech recognition system for Creole;
- A chatbot for written interaction in Creole;
- A voicebot for spoken interaction in Creole (with speech recognition);
- Public APIs for programmatic access to these resources, models and corpora, published on the Hugging Face platform.
In addition to technological development, the project includes training initiatives for civil servants, teachers, researchers, students, entrepreneurs and citizens with special needs, promoting the integration of these tools into organisational and business processes. The social and institutional impact of the use of these technologies will also be assessed.
On Iscte’s side, the project is led by Miguel Sales Dias, Deputy Director of Iscte-Sintra and ISTAR-Iscte, and António Raimundo, lecturer at Iscte-Sintra and integrated researcher at ISTAR-Iscte. At the University of Cape Verde, the project is coordinated by Prof. Dominika Swolkien and Prof. Ana Karina Moreira.