Agile and robust human language technologies for defence – Participation to a technological challenge
European Commission
Objective:
With the digitalisation of the battlefield, which leads to more and more complex user interfaces and to ever-increasing volumes of language data to process, language technologies such as multilingual written or spoken interaction, translation, and information retrieval are needed in an increasing number of defence systems, especially for C4ISR and joint multinational and/or peacekeeping operations.
These technologies have been the subject of much research for several decades, which has led to impressive improvements for some applications such as voice assistants, semi-automated call centres and online translation services. However, even if most of the techniques used in current systems have emerged from defence research, these improvements also rely on the availability of huge amounts of data, typically held by internet actors with a large user base, and have therefore taken place mostly in the civil sector. In the defence sector, where far less data can be made available to developers for confidentiality reasons, improvements have been correspondingly limited, and progress is still needed to meet the requirements of most applications.
To address this lack of operational data for system developers, a workaround is to resort to similar but sharable data. However, this generally implies the ad hoc creation of data, which can be relatively costly, and does not fully solve the issue. A better-suited solution would be for systems to learn directly from user data without disclosing any confidential information to developers. This would not only make it possible to use hitherto unused in-domain data and answer major security and sovereignty concerns, but also lead to significant performance gains thanks to a much more efficient use of the existing data. In addition, while there is already an increasing body of research on innovative approaches to machine learning related to this issue, such as semi- and self-supervised learning, active learning, transfer learning, and frugal learning, channelling these efforts through a technological challenge has the potential to lead to a breakthrough. In this context, the call aims at creating not only generic systems that offer better performance for a wide range of conditions, but also systems that can be adapted by users to offer enhanced performance for specific applications.
The overarching goal of the call is to create a European library of generic and adaptive human language technology components that offer high performance for several defence applications. In particular, the technologies should be robust to noise and communication quality, cover a wide range of languages and dialects including under-resourced ones, manage specific vocabulary, and offer more robust processing of high-level semantic information.
Scope:
The proposals should address technological solutions to process linguistic information in its different forms, i.e. spoken and written (handwriting, printed documents or typed text), in order to recognise, understand and translate it. These solutions should be evaluated in the framework of the technological challenge organised under this call topic. The proposals should in particular address the issue of user-driven system adaptation, i.e. the ability of systems to learn from user supervision without intervention from developers and without regression in performance. Technologies should be integrated into demonstrators with user-friendly interfaces and be easy to integrate into other defence systems.
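One possible reading of "without regression" is an acceptance check: an adapted model is kept only if it scores no worse than the current model on every held-out reference set. The sketch below illustrates that reading only. The function name `accept_adaptation`, the test-set names and the `tolerance` parameter are hypothetical and are not prescribed by the call or by the challenge evaluation plan, and scores are assumed to be "higher is better".

```python
from typing import Dict


def accept_adaptation(
    baseline_scores: Dict[str, float],
    adapted_scores: Dict[str, float],
    tolerance: float = 0.0,
) -> bool:
    """Return True if the adapted model does not regress on any test set.

    Assumptions (illustrative only): scores are "higher is better" and are
    keyed by a hypothetical test-set name. Every test set scored for the
    baseline must also be scored for the adapted model.
    """
    for name, baseline in baseline_scores.items():
        if name not in adapted_scores:
            # Non-regression cannot be verified without a score.
            return False
        if adapted_scores[name] < baseline - tolerance:
            # The adapted model is worse than the baseline on this set.
            return False
    return True


# Hypothetical scores before and after a user-driven adaptation step.
baseline = {"generic": 0.80, "in_domain": 0.60}
improved = {"generic": 0.80, "in_domain": 0.72}
regressed = {"generic": 0.78, "in_domain": 0.72}

keep_improved = accept_adaptation(baseline, improved)
keep_regressed = accept_adaptation(baseline, regressed)
```

With these illustrative numbers, the first adaptation would be kept and the second rejected, because it loses performance on the generic set even though it gains on in-domain data.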
Types of activities
The following table lists the types of activities which are eligible for this topic, and whether they are mandatory or optional (see Article 10(3) EDF Regulation):
Types of activities | Eligible? (art 10(3) EDF Regulation)
(a) Activities that aim to create, underpin and improve knowledge, products and technologies, including disruptive technologies, which can achieve significant effects in the area of defence (generating knowledge) | Yes (mandatory)
(b) Activities that aim to increase interoperability and resilience, including secured production and exchange of data, to master critical defence technologies, to strengthen the security of supply or to enable the effective exploitation of results for defence products and technologies (integrating knowledge) | Yes (mandatory)
(c) Studies, such as feasibility studies to explore the feasibility of new or upgraded products, technologies, processes, services and solutions | Yes (optional)
(d) Design of a defence product, tangible or intangible component or technology as well as the definition of the technical specifications on which such a design has been developed, including any partial test for risk reduction in an industrial or representative environment | Yes (optional)
(e) System prototyping of a defence product, tangible or intangible component or technology | No
(f) Testing of a defence product, tangible or intangible component or technology | No
(g) Qualification of a defence product, tangible or intangible component or technology | No
(h) Certification of a defence product, tangible or intangible component or technology | No
(i) Development of technologies or assets increasing efficiency across the life cycle of defence products and technologies | No
The proposals must cover at least the following tasks as part of the mandatory activities:
* Generating knowledge:
+ research on human language technologies, including innovative approaches for user-driven system adaptation;
+ participation in the evaluation campaigns organised in the framework of the technological challenge, including:
o exchanging with other stakeholders on the evaluation plans;
o submission of systems to experimental performance measurements during the test campaigns managed by the challenge organisers;
o participation in debriefing workshops.
* Integrating knowledge:
+ integration of technological modules into demonstrators that can be tested by representative defence users.
The proposals should include clear descriptions of criteria to assess work package completion. Criteria should include participation in the test campaigns organised in the framework of the technological challenge, and the delivery of descriptions of the systems submitted to the tests.
Functional requirements
The proposed solutions should fulfil the following requirements:
* Systems should be based on software components performing a variety of human language processing functions. These components should be integrated into demonstrators with a user-friendly interface. They should enable users to adapt the components using their own data, without intervention from the system developers.
* Systems should be able to run locally, without a connection to a wide area network, except for specific functions for which this can be duly justified and is compatible with operational missions (e.g. to achieve higher performance when adapting under user supervision).
* Systems should be optimised in terms of memory and CPU footprint and, more generally, should require only reasonable resources in terms of hardware size, weight, price, and energy consumption, in view of their potential integration into existing or future larger defence systems.
* Systems should accept as input files linguistic information in its different forms:
+ speech (in audio files or in the audio stream of video files);
+ handwritten or printed documents;
+ text.
* For each of these forms, systems should accept a wide variety of possible inputs without necessarily having information about the specific type of input. Variability can be in terms of speakers or writers, speaking or writing style, vocabulary, accents, noise, recording or scanning conditions, transmission channels, etc. Systems should thus be speaker/writer-independent, channel-independent, robust to various accents, types of noise, etc. They should in particular be robust to conditions more frequently encountered in military environments (e.g. highly noisy, low communication quality, non-native speech, etc.).
* The scope of the human language processing functions should cover:
+ language identification from speech, documents and text;
+ speech recognition;
+ handwritten and printed document recognition;
+ keyword spotting from speech and handwritten or printed documents;
+ translation from speech, handwritten or printed documents, and texts;
+ high-level semantic (including military-specific) information extraction, such as named entity and event recognition, and relation extraction;
+ cross-language information retrieval;
+ automatic (multi-)document summarisation and visualisation.
* Software components corresponding to the above functions should cover multiple languages and dialects including EU and non-EU ones. Translation should cover all official EU languages as target languages. The proposals should mention the list of languages and dialects that are foreseen to be covered for each function.
* Software components should offer state-of-the-art performance. For each human language processing function mentioned above and for each language or dialect covered, the proposals should mention objectively measured performance (including information on the data and metrics used for the measurements and, if applicable, on the evaluation campaign in the framework of which the measurements were made), and associated references.
* Software components corresponding to functions covered by the technological challenge organised in the framework of the call should be submitted for evaluation therein. Any difference between the version evaluated through the challenge and a version integrated in the demonstrator should be documented. How the proposed approaches and systems will address the tasks outlined in the preliminary evaluation plan (cf. Annex 4) should be described in the proposals. Components corresponding to functions that are not covered by the challenge may also be adapted and enhanced during project execution if deemed appropriate, possibly using data from the challenge.
* The software components should be easy to configure and integrate into defence systems beyond the demonstrators produced in the framework of the challenge. They should follow as much as possible the relevant standards, best practices and guidelines, including those elaborated at the challenge level, in particular for input and output formats.
* Systems and user interfaces should help users as much as possible to understand how the outputs are derived from the inputs (explainable AI), and in particular provide links to the inputs. For example, translations should be accompanied by links to the source language at the level of the term or phrase, and visualisations should include links to the inputs that support the displayed information. Knowledge that is instrumental in determining the output or that can help users make more sense of the inputs, such as bilingual dictionary entries for translation, should also be provided.
* Systems should enable users to adapt them using their own data, e.g. by providing batches of raw or annotated data, or by interactively providing supervision.
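The explainability requirement above (translations accompanied by links to the source at term or phrase level, plus supporting knowledge such as dictionary entries) could be met, for instance, by an output structure that carries character offsets into the source text. The following is a minimal sketch only. The class names `LinkedPhrase` and `LinkedTranslation`, the method `source_for` and the example sentence are hypothetical, and are not an input/output format defined by the call or at the challenge level.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LinkedPhrase:
    """One target-language phrase with a link back to the source text."""
    target_text: str
    source_span: Tuple[int, int]  # (start, end) character offsets in the source
    dictionary_entry: Optional[str] = None  # supporting knowledge, if available


@dataclass
class LinkedTranslation:
    """A translation whose phrases can each be traced to a source span."""
    source_text: str
    phrases: List[LinkedPhrase] = field(default_factory=list)

    def target_text(self) -> str:
        # Join the phrases into the full translated sentence.
        return " ".join(p.target_text for p in self.phrases)

    def source_for(self, index: int) -> str:
        # Return the source substring that supports the phrase at `index`.
        start, end = self.phrases[index].source_span
        return self.source_text[start:end]


# Illustrative example (French to English); the content is invented.
example = LinkedTranslation(
    source_text="Le convoi arrive demain",
    phrases=[
        LinkedPhrase("The convoy", (0, 9), "convoi (n.m.): convoy"),
        LinkedPhrase("arrives", (10, 16)),
        LinkedPhrase("tomorrow", (17, 23)),
    ],
)
```

A user interface built on such a structure could highlight `example.source_for(i)` when the user selects the i-th translated phrase, and display the dictionary entry alongside it where one is attached.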
Expected Impact:
The outcome should contribute to:
* a strengthened EDTIB (European Defence Technological and Industrial Base) and enhanced technological autonomy for defence-oriented HLT systems;
* a broader, cheaper and easier usage of HLT systems for defence;
* enhanced defence systems in various domains, in particular C4ISR;
* enhanced EU freedom of action.
Admissibility conditions: described in section 5 of the call document
Proposal page limits and layout: described in Part B of the Application Form available in the Submission System
Eligible countries: described in section 6 of the call document
Other eligibility conditions: described in section 6 of the call document
Financial and operational capacity and exclusion: described in section 7 of the call document
Evaluation and award:
Award criteria, scoring and thresholds: described in section 9 of the call document
Submission and evaluation processes: described in section 8 of the call document and the Online Manual
Indicative timeline for evaluation and grant agreement: described in section 4 of the call document
Legal and financial set-up of the grants: described in section 10 of the call document
Call documents:
Templates for proposals should be downloaded from the Submission System (available at the opening of the call); the links below are examples only:
- EDF Standard application form
- Detailed budget table EDF LS RA
- Participant information (including previous projects, if any)
- List of infrastructure, facilities, assets and resources
- Actual indirect cost methodology declarations (if actual indirect costs used)
- Ownership control declarations EDF General MGA v1.0
Additional documents:
Generic Programme Security Instruction (PSI) concerning European Defence Fund
EU Financial Regulation 2018/1046
Rules for Legal Entity Validation, LEAR Appointment and Financial Capacity Assessment
EU Grants AGA — Annotated Model Grant Agreement
Funding & Tenders Portal Online Manual