Foundations of AI and Data Science

Foundations of AI and Data Science is a critical area of research at CAIDAS that explores the underlying principles and techniques behind the development of AI and data science applications. This area encompasses several sub-disciplines, each of which focuses on a specific aspect of AI and data science research. The sub-areas of Foundations of AI and Data Science include Deep Learning, Representation Learning, Reinforcement Learning, Statistical Relational Learning, Machine Learning for Complex Networks, Computer Vision, Natural Language Processing, and Pattern Recognition.

Deep Learning focuses on the development of algorithms that enable artificial neural networks to learn from large amounts of data. These algorithms allow AI systems to improve their performance over time and can be applied to various applications, including image recognition, speech recognition, and natural language processing.

Representation Learning deals with the development of algorithms that can effectively represent complex data structures. These algorithms are critical for building AI systems that can learn from structured and unstructured data and can be used to improve the performance of applications such as computer vision and natural language processing.

Reinforcement Learning focuses on the development of algorithms that allow AI systems to learn from experience. These algorithms are used in applications that require the AI system to make decisions based on the consequences of its actions, such as robotics, gaming, and autonomous driving.

Statistical Relational Learning deals with the development of algorithms that can effectively model relationships between entities in data. These algorithms can be used to improve the performance of applications that require the analysis of complex, relational data, such as knowledge graphs and social networks.

Machine Learning for Complex Networks focuses on the development of algorithms that can effectively analyze complex networks of data, such as those found in social networks, transportation networks, and biological networks.

Computer Vision deals with the development of algorithms that can analyze and understand images and videos. These algorithms can be used in applications such as object recognition, face recognition, and scene analysis.

Natural Language Processing deals with the development of algorithms that can analyze and understand human language. These algorithms can be used in applications such as speech recognition, sentiment analysis, and machine translation.

Pattern Recognition deals with the development of algorithms that can recognize patterns in data. These algorithms can be used in applications such as audio data recognition in ecology, image classification, and speech recognition.

In conclusion, the area of Foundations of AI and Data Science at CAIDAS is a critical component of AI and data science research. It encompasses a wide range of sub-disciplines that each contribute to the advancement of AI and data science applications.

Research Areas

The area of Deep Learning within CAIDAS research focuses on the development and advancement of algorithms and techniques that enable machine learning models to learn and make decisions based on large and complex data. The goal is to build models that can simulate human decision-making and pattern recognition abilities, allowing them to perform tasks such as image and speech recognition, natural language processing, and more.

Deep learning models are loosely inspired by the structure and function of the human brain and are based on artificial neural networks. These models consist of layers of interconnected nodes, or artificial neurons, which are trained to recognize patterns in data. Provided the training data is representative of the task, a model generally becomes more accurate in its predictions and classifications as it is exposed to more data.
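To make the idea of layers of interconnected artificial neurons more concrete, the following minimal Python sketch passes an input through two small layers. The weights are hand-picked, hypothetical values chosen only for illustration; in practice they would be learned from data, and training is not shown here.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    """One layer: every neuron computes a weighted sum of all inputs,
    adds its bias, and applies the sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed the inputs through the layers, one after the other."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
output = forward([1.0, 1.0], layers)  # a list with one value in (0, 1)
```

Each layer consumes the activations of the previous one, which is what allows deeper layers to respond to increasingly abstract patterns.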

Deep learning techniques are used in many areas of AI and data science and are particularly well-suited for tasks where large amounts of data are available, such as image and speech recognition and natural language processing. In these tasks, deep learning models often outperform traditional machine learning techniques.

Another sub-area that is closely related to deep learning is representation learning, which deals with the development of methods for learning useful representations of data that can be used for various tasks. Representation learning is a key component in deep learning, as the representations learned by deep learning models play a critical role in their ability to make accurate predictions and classifications.

In the context of CAIDAS research, the area of deep learning is a central and interdisciplinary field, connecting many other sub-areas such as computer vision, natural language processing, and machine learning for complex networks. The focus is on the development of new algorithms and techniques that can improve the performance and scalability of deep learning models, as well as the exploration of new applications and domains where these models can be applied.

Representation Learning is a sub-area within the Foundations of AI and Data Science that focuses on the development of methods and techniques for learning the underlying representations of data. The goal of representation learning is to extract meaningful and useful representations of data that can be used for a variety of tasks, such as classification, clustering, and dimensionality reduction.

In the context of CAIDAS and its research, representation learning plays a crucial role in many of the application pillars. For example, in the area of computer vision, representation learning can be used to extract meaningful features from images, such as edges, corners, and shapes, that can be used for tasks such as object recognition. In natural language processing, representation learning can be used to extract semantic representations of words and phrases that can be used for tasks such as sentiment analysis and machine translation.

Representation learning can be performed in a variety of ways, including unsupervised, supervised, and semi-supervised learning. Unsupervised representation learning algorithms learn from unlabeled data: auto-encoders minimize a reconstruction loss, while generative adversarial networks match the distribution of generated samples to the distribution of the input data. Supervised representation learning algorithms, such as multi-layer perceptrons and convolutional neural networks, learn representations by minimizing a prediction loss, such as cross-entropy, using labeled data. Semi-supervised representation learning algorithms combine aspects of both by using a limited amount of labeled data and a large amount of unlabeled data; a common pattern is self-supervised pre-training on the unlabeled data followed by fine-tuning on the labeled subset.
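The two kinds of loss mentioned above can be written down in a few lines. The sketch below is illustrative only: it assumes mean squared error as the reconstruction loss (other choices are possible), and the function names are hypothetical rather than taken from any particular library.

```python
import math

def reconstruction_loss(x, x_reconstructed):
    """Mean squared error between an input and its reconstruction.
    An auto-encoder is trained to make this value small."""
    return sum((a - b) ** 2 for a, b in zip(x, x_reconstructed)) / len(x)

def cross_entropy(predicted_probs, true_class):
    """Negative log-probability that the model assigns to the correct class.
    A supervised classifier is trained to make this value small."""
    return -math.log(predicted_probs[true_class])

# A perfect reconstruction has zero loss; a poor one has a larger loss.
perfect = reconstruction_loss([1.0, 0.0], [1.0, 0.0])
poor = reconstruction_loss([1.0, 0.0], [0.0, 0.0])

# A confident, correct prediction is penalized less than a wrong one.
confident_right = cross_entropy([0.9, 0.1], 0)
confident_wrong = cross_entropy([0.9, 0.1], 1)
```

In both cases, training adjusts the model parameters to lower the loss, and the internal activations that emerge along the way serve as the learned representation.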

Overall, representation learning is an important area of research within CAIDAS as it provides the foundation for many of the application pillars and has numerous real-world applications. By developing methods and techniques for learning representations that are meaningful, robust, and scalable, CAIDAS researchers aim to advance the field of AI and data science and to create new opportunities for innovation.

Enabling agents to intelligently interact with the world around them is a long-standing goal of AI research. Reinforcement learning is a framework that addresses this problem by providing a mathematical model of the interaction between an agent and its environment. Agents execute actions in the world, observe their effects, and use these observations to improve their future actions.
Through this flexible model, reinforcement learning complements the more classical approaches of optimal control and decision-making by providing effective algorithms for obtaining extremely complex behavior. This holds especially since its combination with deep learning techniques, which has made it possible to tackle problems that were considered unsolvable only a few years ago.
Research in reinforcement learning is very active and continues to address important open questions, paving the way towards the intelligent autonomous agents of the future.
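The loop of acting, observing the effect, and improving can be sketched with tabular Q-learning, one of the simplest reinforcement learning algorithms. The environment below is a hypothetical five-state corridor invented for this illustration, and the hyperparameters are arbitrary choices; it is a toy sketch, not a description of the methods developed at CAIDAS.

```python
import random

N_STATES = 5         # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)   # action 0 moves left, action 1 moves right

def step(state, action):
    """Environment: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice; exploration steps and ties are random."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(len(ACTIONS))
    return 0 if q_row[0] > q_row[1] else 1

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # value estimate per state/action
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Preferred action index for every non-goal state."""
    return [1 if row[1] > row[0] else 0 for row in q[:-1]]

q_table = train()
policy = greedy_policy(q_table)  # one action index per state 0..3
```

The agent is never told that moving right is correct; it infers this from the rewards it observes. Deep reinforcement learning replaces the table `q` with a neural network so that the same idea scales to much larger state spaces.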

Relational data mining techniques play an important role in many disciplines, such as information science, sociology, bioinformatics or economics. They provide new ways to explore large corpora of data which capture dyadic relationships, interactions or links between documents, humans, genes, or financial institutions. However, we now increasingly have access to complex data that capture more than just dyadic relations. Examples include multi-relational data, time-stamped relations, relational data with noise, or sequential data. The question of when a graph abstraction of such complex relational data is justified has not been answered satisfactorily. To address this problem, we develop new algorithmic and statistical data mining techniques for relational data with complex characteristics.

Graph analytics and (social) network analysis have become cornerstones of data science. They are widely applied to relational data studied in disciplines such as computer science, physics, systems biology, social science or economics. However, we are increasingly confronted with high-frequency, time-resolved data which not only tell us who is related to whom, but also when and in which sequence these relations occurred. The analysis of such data is still a challenge. A naive application of network analysis and modeling techniques discards information on the timing and ordering of relations. This information is the foundation of so-called causal or time-respecting paths, i.e. it is needed to answer the question of who can influence whom.
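A small example shows why the ordering of relations matters. The Python sketch below computes which nodes can be reached from a source along time-respecting paths, i.e. paths whose time stamps strictly increase. The function name and the edge format `(source, target, time)` are assumptions made for this illustration.

```python
def reachable(temporal_edges, source):
    """Nodes that `source` can influence via time-respecting paths.

    temporal_edges: iterable of directed edges (u, v, t).
    An edge (u, v, t) can only be used if u was reached before time t.
    """
    earliest = {source: float("-inf")}  # earliest known arrival time per node
    # Processing edges in temporal order means the first arrival is the earliest.
    for u, v, t in sorted(temporal_edges, key=lambda edge: edge[2]):
        if u in earliest and earliest[u] < t and v not in earliest:
            earliest[v] = t
    return set(earliest) - {source}

# Both data sets yield the same static graph a -> b -> c ...
in_order = [("a", "b", 1), ("b", "c", 2)]
reversed_order = [("a", "b", 2), ("b", "c", 1)]

# ... but only in the first one can a influence c.
print(reachable(in_order, "a"))        # b and c
print(reachable(reversed_order, "a"))  # only b
```

A static network analysis would treat both data sets identically and wrongly conclude that `a` can influence `c` in either case.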

To address the problem that common graphical representations of relational data discard information on the temporal ordering of relations, we at CAIDAS developed new machine learning techniques based on higher-order graphical models. Extending the common network perspective, these models make it possible to combine information on both topological and temporal characteristics of time-resolved relational data into compact graphical models. This approach provides new ways to (i) model dynamical processes like diffusion, cascades or epidemic spreading, (ii) detect temporal-topological clusters based on higher-order Laplacians and spectral methods, (iii) assess the importance of nodes, and (iv) study the controllability of complex systems. This research aims at methodological advances which not only provide us with novel data mining techniques, but whose impact reaches beyond computer science, with applications in the modeling of complex systems in physics, systems biology, social science and economics.
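The basic intuition behind a higher-order model can be illustrated with a simplified second-order example, in which the nodes of the model are the edges of the original network. The sketch below is a generic illustration under that assumption, with hypothetical function names and toy path data; it is not the implementation used in CAIDAS research.

```python
from collections import Counter

def first_order_transitions(paths):
    """Count how often each edge (u, v) is traversed."""
    counts = Counter()
    for path in paths:
        for u, v in zip(path, path[1:]):
            counts[(u, v)] += 1
    return counts

def second_order_transitions(paths):
    """Count transitions between consecutive edges (u, v) -> (v, w).
    These counts remember where a path came from before reaching v."""
    counts = Counter()
    for path in paths:
        for u, v, w in zip(path, path[1:], path[2:]):
            counts[((u, v), (v, w))] += 1
    return counts

# Toy data: paths arriving at c from a continue to d,
# paths arriving at c from b continue to e.
paths = [("a", "c", "d")] * 2 + [("b", "c", "e")] * 2

first_order = first_order_transitions(paths)
second_order = second_order_transitions(paths)
```

In the first-order counts, `c` appears to lead to `d` and `e` equally often, which suggests that `a` can reach `e`. The second-order counts reveal that the transition from edge `(a, c)` to edge `(c, e)` never occurs in the data.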

The area of Computer Vision within the context of CAIDAS research is focused on the development of machine learning algorithms and techniques that enable computers to process, understand, and interpret visual information. This field is concerned with how to extract information from digital images and video data, and how to use that information to make sense of the world. In the context of CAIDAS, researchers in this area are working to advance the state of the art in computer vision, with a focus on developing techniques that can be applied to a wide range of real-world problems.

One key area of focus within computer vision research at CAIDAS is object detection and recognition. Researchers are exploring ways to develop algorithms that can accurately identify objects within images and video, and to classify those objects into different categories. This is a critical capability for a wide range of applications, from security and surveillance, to autonomous driving, to retail and advertising.

Another important area of focus within computer vision at CAIDAS is image processing and analysis. Researchers are working to develop algorithms that can process large amounts of visual data and extract relevant information, such as edges, corners, and other features. These algorithms are used to support a wide range of applications, including object tracking, motion analysis, and medical image analysis.
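As a concrete example of extracting a low-level feature such as an edge, the following sketch applies the classical Sobel operator to a tiny grayscale image stored as nested lists. It is a plain-Python illustration under simplifying assumptions (border pixels are left at zero, and the kernel is applied as a cross-correlation without flipping); production systems would typically rely on an optimized image processing library.

```python
# Sobel kernel that responds to intensity changes from left to right.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

def horizontal_gradient(image):
    """Slide the kernel over every interior pixel of the image.
    Large absolute values mark vertical edges."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            result[r][c] = sum(
                SOBEL_X[i][j] * image[r + i - 1][c + j - 1]
                for i in range(3)
                for j in range(3)
            )
    return result

# Toy image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
edges = horizontal_gradient(image)
# Interior rows are [0, 0, 4, 4, 0, 0]: the response is non-zero
# only where the intensity jumps from dark to bright.
```

Modern deep learning models learn filters of this kind from data instead of relying on hand-designed kernels.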

The field of computer vision is also closely tied to other areas of AI and data science, such as deep learning and natural language processing. At CAIDAS, researchers are exploring ways to integrate these different fields, and to develop algorithms that can process and understand complex visual information in real-time.

Overall, the area of Computer Vision within CAIDAS research is dedicated to advancing our understanding of how computers can process and understand visual information, and to developing new techniques and algorithms that can be used in a wide range of real-world applications. Whether it's improving the accuracy of object recognition, or developing new ways to process and analyze images, CAIDAS researchers are at the forefront of this exciting field, working to unlock the full potential of computer vision.

Knowledge-enriched NLP focuses on models that combine unstructured information, such as text corpora, with structured information or explicitly represented knowledge, typically in the form of knowledge graphs or ontologies. Typical research questions are as follows:

Can knowledge be automatically extracted from learned language models to improve existing knowledge graphs?
How can knowledge graphs best be represented for integration into state-of-the-art NLP language models?
Can weakly structured information be used to improve not only existing knowledge graphs but also language models and their performance on downstream tasks?

Knowledge-enriched NLP is closely related to other sub-areas such as Representation Learning and Deep Learning, and can benefit from advancements in graph-based algorithms and machine learning for complex networks. By incorporating structured knowledge, CAIDAS researchers aim to advance NLP and its real-world applications.

Pattern Recognition is an important sub-area of Foundations of AI and Data Science research at CAIDAS. It deals with the automatic classification of data into meaningful patterns or categories. This area of research is closely linked to other sub-areas such as Machine Learning, Computer Vision, and Natural Language Processing, as pattern recognition is a crucial component of many AI and data science applications.

The main goal of pattern recognition is to develop algorithms and models that can automatically identify patterns in data, without human intervention. These algorithms take raw data as input, and then identify meaningful patterns or relationships that exist within the data. This information can then be used to make predictions, classify new data, or perform other data-driven tasks.
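A nearest-centroid classifier is one of the simplest examples of this workflow: it summarizes each category by the mean of its training samples and assigns new data to the closest mean. The sketch below uses hypothetical two-dimensional measurements and labels invented for this illustration.

```python
import math

def fit_centroids(samples, labels):
    """Compute the mean feature vector (centroid) of every class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))

# Hypothetical training data: two features per sample, two classes.
samples = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (3.8, 4.0)]
labels = ["low", "low", "high", "high"]

centroids = fit_centroids(samples, labels)
print(predict(centroids, (1.0, 0.9)))  # low
print(predict(centroids, (4.1, 3.9)))  # high
```

The pattern found in the raw data (here, two clusters of points) is then reused to classify samples the model has not seen before.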

CAIDAS researchers work on a range of pattern recognition problems, including image classification, speech recognition, text classification, and bio-medical signal analysis, among others. They employ a variety of techniques, including statistical methods, machine learning algorithms, and deep learning techniques, to build robust and accurate pattern recognition models.

The applications of pattern recognition are numerous, ranging from image and speech recognition in consumer electronics to medical diagnosis and predictive maintenance in industrial settings. For example, in computer vision, pattern recognition algorithms can be used to automatically identify objects in images, classify images into different categories, and recognize facial features for biometric authentication. In natural language processing, pattern recognition techniques can be used to identify sentiment in text, classify text into different categories, and perform machine translation.

Principal Investigators

Radu Timofte

Goran Glavaš

Andreas Hotho

Frank Puppe

Christof Weiß

Carlo D'Eramo

Ingo Scholtes

Damien Garreau

Leon Bungert

Katharina Breininger

Adrian Krenzer

Sabine Fischer

N.N.
