Ready-made AI solutions are not practical for every technical application. Often, AI engineering is needed to adapt the architecture of the machine learning model in question, or the way it is deployed, to the specific demands of the application. This can lead to patentable inventions under the rules and case law applicable at the EPO.
Examples of patentable automotive technology solutions under the EPC
Compared with generative AI, such as large language models (LLMs) for generating and editing text or models for generating images and videos, AI for use in vehicles and other mobile environments must run on comparatively spartan hardware. Power consumption, heat generation, overall size and, not least, the cost of the corresponding control units are all tightly constrained.
It is therefore desirable to simplify the architecture of a neural network while maintaining as much of its performance as possible. For example, weights and other parameters of the network can be quantised to one of only a limited number of possible discrete values. Individual neurons, or whole parts of the network, that are deemed less important can be pruned. Certain neurons can also be randomly deactivated during training, a regularisation technique known as 'dropout'. Finally, a network can be composed of several parts in a 'mixture of experts', of which only the part most competent for each input becomes active.
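As an illustration only, the following PyTorch sketch shows post-training weight quantisation to a small set of discrete values and magnitude-based pruning. The function names, the 16 quantisation levels and the 50% sparsity are arbitrary assumptions rather than a prescribed method.

```python
import torch
import torch.nn as nn

def quantise_weights(layer: nn.Linear, levels: int = 16) -> None:
    """Snap each weight to one of `levels` evenly spaced discrete values."""
    with torch.no_grad():
        w = layer.weight
        w_min, w_max = w.min(), w.max()
        step = (w_max - w_min) / (levels - 1)
        layer.weight.copy_(torch.round((w - w_min) / step) * step + w_min)

def prune_weights(layer: nn.Linear, fraction: float = 0.5) -> None:
    """Zero out the `fraction` of weights with the smallest magnitude."""
    with torch.no_grad():
        w = layer.weight
        threshold = w.abs().flatten().kthvalue(int(fraction * w.numel())).values
        w.mul_((w.abs() > threshold).float())

layer = nn.Linear(64, 32)
quantise_weights(layer, levels=16)
prune_weights(layer, fraction=0.5)
```

Both operations shrink the effective model without retraining; in practice, a short fine-tuning pass usually follows to recover any lost accuracy.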
The more AI is involved in important decisions, the more important reliability and security become. Many AI models are by default a ‘black box’ whose processing steps from input to end result are difficult to track. ‘Explainable AI’ therefore aims to make it easy to understand exactly what an AI model bases its decision on. For example, a ‘saliency map’ can show which areas of an image were crucial for the decision of an image classifier.
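A gradient-based saliency map can be sketched in a few lines. The following assumes a PyTorch image classifier that takes a (channels, height, width) tensor; it is a minimal illustration, not a complete explainability method.

```python
import torch

def saliency_map(model: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    """Returns a (height, width) map of how strongly each pixel
    influenced the classifier's top-scoring class."""
    model.eval()
    image = image.clone().requires_grad_(True)
    scores = model(image.unsqueeze(0))         # shape: (1, num_classes)
    scores.max().backward()                    # d(top score) / d(each pixel)
    return image.grad.abs().max(dim=0).values  # collapse colour channels
```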
In this context, it is also important to know how quickly the decision can ‘flip’ if the input is changed. For example, manipulating a stop sign with a sticker that is barely noticeable to the human eye can cause a traffic sign recognition system to recognise a completely different traffic sign and to disregard the stop sign. The AI model must therefore be made resistant to such ‘adversarial examples’ or at least be able to recognise them.
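The best-known recipe for constructing such adversarial examples is the fast gradient sign method (FGSM). A minimal sketch, assuming a PyTorch classifier and an input image normalised to [0, 1]:

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, image, label, epsilon=0.01):
    """Fast gradient sign method: a barely visible perturbation of the
    input that pushes the classifier towards a wrong decision."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image.unsqueeze(0)), label.unsqueeze(0))
    loss.backward()
    # Step in the direction that increases the loss the most.
    return (image + epsilon * image.grad.sign()).clamp(0.0, 1.0).detach()
```

Training on such perturbed examples ('adversarial training') is one common way of hardening a model against them.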
If the data processed by AI is measurement data, it may be affected by noise and other uncertainties. To capture and quantify these uncertainties, the processing can be designed to be wholly or partially probabilistic. That is, instead of concrete values for variables, parameters characterising the distribution functions of these variables are calculated. Concrete values for the variables can then be drawn from these distribution functions.
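A minimal sketch of such a probabilistic output, assuming a PyTorch model whose head predicts the mean and log-variance of a Gaussian rather than a point value (the class name and layer layout are illustrative assumptions):

```python
import torch
import torch.nn as nn

class ProbabilisticHead(nn.Module):
    """Outputs the parameters of a Gaussian instead of a point value,
    then draws a concrete value from that distribution."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.mean = nn.Linear(in_features, out_features)
        self.log_var = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mu, log_var = self.mean(x), self.log_var(x)
        std = torch.exp(0.5 * log_var)           # distribution parameters
        return mu + std * torch.randn_like(std)  # a sample drawn from them
```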
To prevent small uncertainties in the input from steering the processing in completely different, surprising directions, batch normalisation and other regularisation methods can also be used.
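For illustration, a small PyTorch block combining batch normalisation with dropout; the layer sizes and dropout rate are arbitrary assumptions.

```python
import torch.nn as nn

# Batch normalisation keeps activations in a stable range, so small
# perturbations of the input are less likely to blow up downstream.
block = nn.Sequential(
    nn.Linear(128, 64),
    nn.BatchNorm1d(64),
    nn.ReLU(),
    nn.Dropout(p=0.1),
)
```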
When processing data with AI, the task of separating important information from unimportant information often arises. Autoencoder architectures can be used in such instances. These architectures encode the input into a representation in a latent space whose dimensionality is usually drastically reduced compared with the input. A decoder is then used to reconstruct the original input, or another variable of interest, from this latent code. The low dimensionality of the representation forces the input through an 'information bottleneck'. The encoder thus learns to extract the most important information from the input, and the decoder learns to reconstruct as much as possible from the little information it receives.
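A minimal autoencoder sketch in PyTorch; the 784-dimensional input and 16-dimensional latent space are arbitrary assumptions chosen only to make the 'information bottleneck' visible.

```python
import torch.nn as nn

class Autoencoder(nn.Module):
    """Squeezes the input through a low-dimensional 'information bottleneck'."""
    def __init__(self, in_dim: int = 784, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),      # the latent code
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, in_dim),          # reconstruction of the input
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```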
Training, data, and model adaptation
Language models, in particular LLMs, are increasingly used as a universal tool to find, on the basis of a large body of existing knowledge, an answer to a question that is as well founded as possible. That existing knowledge can be incorporated in various ways. For example, a generically pre-trained LLM can be further trained on it ('fine-tuning'). It can be fed to the LLM as additional context in light of which the answer is to be formulated. Or the LLM can be given the opportunity to query the knowledge itself from a database or other storage ('retrieval-augmented generation').
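The retrieval step can be sketched as follows; `index.search` and `llm.generate` are hypothetical stand-ins for whatever vector store and language model are actually used, so this is an outline of the pattern rather than any particular product's API.

```python
def answer_with_retrieval(question: str, index, llm) -> str:
    """Retrieve the most relevant passages, then let the LLM answer
    in light of them."""
    passages = index.search(question, top_k=3)   # hypothetical vector store
    context = "\n".join(passages)
    prompt = (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return llm.generate(prompt)                  # hypothetical LLM interface
```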
Another factor in the engineering of AI applications is where to obtain the necessary training data. Particularly in supervised training, where the output of the AI model is compared with a target output ('ground truth label') and the deviation is evaluated using a cost function ('loss function'), labelling training examples with the corresponding target outputs is often a manual and time-consuming process. Data augmentation is therefore used to generate variations of training examples for which a previously assigned ground truth label remains valid, so that no additional effort is required to label these new training examples.
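For illustration, a label-preserving augmentation pipeline using torchvision; the specific transforms and their parameters are assumptions, and they must be chosen per task so that the label really does remain valid.

```python
import torchvision.transforms as T

# Variations of a street scene that leave its label intact; note that some
# transforms (a horizontal flip, for instance) can invalidate
# direction-sensitive labels such as turn arrows.
augment = T.Compose([
    T.ColorJitter(brightness=0.2, contrast=0.2),
    T.RandomCrop(size=224, padding=8),
])
# augmented = augment(image)  # reuses the ground truth label of `image`
```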
Using various domain transfer methods, training examples from a specific domain (such as images taken in summer or during the day) can be transferred to another domain (such as images taken in winter or at night) while retaining the respective semantic content (e.g., road users, roads, or other object instances). Generative adversarial networks (GANs), such as CycleGAN, are important tools for such domain transfer.
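The core of CycleGAN is a cycle-consistency loss: translating an image to the target domain and back should return (almost) the original. A minimal sketch, omitting the adversarial terms, with hypothetical generator arguments:

```python
import torch.nn.functional as F

def cycle_consistency_loss(g_summer_to_winter, g_winter_to_summer, summer_batch):
    """Translating to the target domain and back should return (almost)
    the original image, which preserves the semantic content."""
    winter = g_summer_to_winter(summer_batch)
    reconstructed = g_winter_to_summer(winter)
    return F.l1_loss(reconstructed, summer_batch)
```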
If ground truth labels are obtained on an ad hoc basis during training, ‘active learning’ can be used to specifically select those training examples whose labels promise the greatest learning success. Comprehensive training on training examples with sufficient variability reduces the likelihood that, when the AI model is later applied, a specific input will suddenly prove to be ‘out-of-distribution’ and may not be handled correctly.
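A common selection criterion is predictive uncertainty. A minimal PyTorch sketch using the entropy of the model's output distribution (the function name and labelling budget are illustrative assumptions):

```python
import torch

def select_for_labelling(model, unlabelled: torch.Tensor, budget: int = 10):
    """Uncertainty sampling: request labels where the model is least sure."""
    with torch.no_grad():
        probs = torch.softmax(model(unlabelled), dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return entropy.topk(budget).indices  # the most informative examples
```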
Training cannot always be carried out by a single entity. For one thing, the resources required are often too great; for another, in many applications the training examples contain personal data, such as recognisable faces or licence plates, so collection by a central entity is subject to legal restrictions, especially when data is transmitted across national borders. 'Federated learning' lets many participants each train on a locally available portion of the training examples, so that afterwards only their contributions to the training result, and not the training examples themselves, need to be collected centrally.
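The best-known aggregation rule is federated averaging (FedAvg). A simplified PyTorch sketch, which for brevity averages all entries of the state dict, including buffers such as batch-norm statistics:

```python
import copy
import torch

def federated_average(global_model, client_models):
    """FedAvg: average the clients' parameters; only these updates, never
    the underlying training examples, reach the central entity."""
    avg_state = copy.deepcopy(global_model.state_dict())
    for key in avg_state:
        avg_state[key] = torch.stack(
            [cm.state_dict()[key].float() for cm in client_models]
        ).mean(dim=0)
    global_model.load_state_dict(avg_state)
    return global_model
```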
AI models, such as neural networks, do not always have to be retrained from scratch. They can also be trained to adopt all or part of the knowledge embodied in an existing model; for example, through ‘transfer learning’ or a ‘teacher–student’ approach. With such approaches, for example, a ‘student’ network with a smaller architecture that is to be used in a vehicle can adopt just enough knowledge from a much larger ‘teacher’ network to enable it to perform the upcoming driving task.
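The classic distillation loss lets the student match the teacher's 'softened' output distribution. A minimal PyTorch sketch (the temperature of 2.0 is an illustrative assumption):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """The student is trained to match the teacher's 'softened' outputs."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    log_student = F.log_softmax(student_logits / temperature, dim=1)
    # temperature**2 keeps gradient magnitudes comparable across temperatures
    return F.kl_div(log_student, soft_teacher,
                    reduction="batchmean") * temperature**2
```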
Patentable AI applications: final thoughts
The innovative aspects of AI engineering outlined above for exemplary automotive applications require a detailed analysis of their technical features in order to render the specific AI models patentable under the rules and case law of the EPO. Optimal patent protection within the scope of the European Patent Convention often requires a deep understanding of the technical complexity of such innovations, in automotive engineering and elsewhere.