Less training data, higher accuracy

Multitasking & Multimodal AI

Deep learning yields great results across many fields, but getting a deep model to work well on each new problem requires research into the architecture and a long period of tuning. Our technologies yield accurate results on a wide range of problems spanning multiple domains. This is possible thanks to multi-task and multimodal learning approaches, which let us build general-purpose intelligence systems that are easily adaptable to specific domains.

Still, AI applications need human supervision, which is difficult, time-consuming and expensive to provide. We use active learning techniques to let our algorithms perform better with less training data by allowing them to choose the data from which they learn.
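As a minimal sketch of the idea, assume uncertainty sampling, one common active learning strategy and not necessarily the one used in our products. The model scores a pool of unlabeled examples, and only the examples it is least sure about are sent to a human for labeling. The function names and the probability values below are hypothetical and serve only as an illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool_probs, k):
    """Return the indices of the k pool items with the highest entropy,
    i.e. the examples the model is least certain about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical predicted class probabilities for four unlabeled examples.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, little to learn from a label
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # fairly confident
    [0.34, 0.33, 0.33],  # almost uniform, most uncertain
]

# Ask a human annotator to label only the two most uncertain examples.
queries = select_for_labeling(pool_probs, k=2)
print(queries)  # [3, 1]
```

In a full loop, the newly labeled examples would be added to the training set, the model retrained, and the selection repeated until the labeling budget is spent.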

Thanks to this powerful approach, we have been able to build core products that handle text, audio and images. Document Element Recognition, Speech Element Recognition and Image Element Recognition are easily adaptable, interoperable solutions already applied in a number of different domains and applications.

Moreover, the suitability principle behind Aptus.AI also applies to the target infrastructure and demand: our products are production-ready, interoperable, scalable and cross-cloud artificial intelligence solutions that can be easily integrated into your next cognitive application.




We process textual data using state-of-the-art Natural Language Understanding algorithms to extract any kind of information.
We already have 39 models that solve different tasks in 24 languages across 7 different domains.

Supported languages: Italian, English, French, German, Arabic, Chinese, Russian, Spanish, Dutch, Portuguese, Swedish, Finnish, Norwegian, Danish, Hungarian, Polish, Romanian, Croatian, Turkish, Hebrew, Persian, Hindi, Indonesian, Japanese
Domains: Medical, Legal, News, Social Media, Finance, Academia, Government
Solved tasks:
Language Identification
Part-of-Speech Tagging
Labelled Dependency Parsing
Text Classification
Sentiment Analysis & Irony Detection
Author Profiling
Named Entity Recognition
Keyword Extraction
Automatic Summarization
Semantics-Driven Text Segmentation
Structured Reporting



We identify and classify any kind of audio and video, and perform Speech Recognition and Speaker Profiling.
We already have 7 models performing different tasks.

Supported languages: Italian, English, Portuguese
Solved tasks:
Event Recognition (527 classes of sound, including music, speech, vehicles, trains and planes)
Speaker Identification
Speaker Profiling
Speaker Emotion Detection
Speaker Diarization



We analyze images, extracting different kinds of elements, including objects, shapes, faces and environmental characteristics, together with their attributes.
We already have 5 models performing different tasks.

Solved tasks:
Object Detection
Face Recognition
Face Profiling
Image Classification
Medical Exam Image Analysis

Let's mix it up

Thanks to our multimodal learning approach, we can combine our products to solve complex tasks and deal with different kinds of input data. This leads us to build smarter algorithms that can share information and give feedback to each other. The resulting models are able to extract any information element from your unstructured data.

Document Element Recognition

Speech Element Recognition

Image Element Recognition

Any Element Recognition

Our combined solution that solves your complex tasks.

Dealing with different inputs and tasks lets our neural networks share information, making them smarter.

This is Aptus.AI


Want to try out our powerful products on your business problem? A team of experts is ready to find the best solution for your task.
Tell us what you need, and our algorithms will do the rest.

