Publications of Andreas Holzinger > Scholar, DBLP, ORCID

Andreas Holzinger has built a track record in AI/machine learning (see definition). He has been working on integrated machine learning, which is manifested in his HCI-KDD approach. This approach is based on a synergistic combination of two different fields to understand context, and is now very important for what is called explainable AI (xAI) and interpretable, ethically responsible machine learning: Human–Computer Interaction (HCI), rooted in cognitive science and particularly concerned with human intelligence, and Knowledge Discovery/Data Mining (KDD), rooted in computer science and particularly concerned with artificial intelligence. This approach is the basis for Human-Centered AI (HCAI) in general and for Explainability and Causability in particular. Andreas has pioneered the interactive machine learning approach with a human-in-the-loop. He proved this concept with his glass-box approach, which is now becoming important due to the rising ethical, social, and legal issues governed, e.g., by the European Union. It will become important to make decisions transparent, retraceable, and human-interpretable in order to explain why a machine decision has been made. "The why" is often more important than a pure classification result.

Subject: Computer Science > Artificial Intelligence (102001)
Technical Area: Machine Learning (102019)
Application Area: Health Informatics (102020)
Keywords: Human-Centered AI, Explainable AI, ethically responsible Machine Learning, interactive Machine Learning (iML), Decision Support Systems, Intelligent User Interfaces

Publication metrics as of 26.09.2019 14:00 MST:

Google Scholar citations: 12,005, Google Scholar h-Index: 51
Scopus h-Index = 35, Scopus citations = 5510
DBLP Peer-reviewed conference papers = 179, Peer-reviewed journal papers = 73

Measuring the Quality of Explanations: The System Causability Scale (SCS). Comparing Human and Machine Explanations.

Andreas Holzinger, Andre Carrington & Heimo Müller 2020. Measuring the Quality of Explanations: The System Causability Scale (SCS). Comparing Human and Machine Explanations. KI – Künstliche Intelligenz (German Journal of Artificial Intelligence), Special Issue on Interactive Machine Learning, edited by Kristian Kersting, TU Darmstadt, 34, (2), doi:10.1007/s13218-020-00636-z.

In this paper we introduce our System Causability Scale (SCS) to measure the quality of explanations. It is based on the notion of Causability (Holzinger et al., 2019) combined with concepts adapted from the widely accepted System Usability Scale (SUS). In the same way that usability measures the quality of use, causability measures the quality of explanations.
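As a concrete illustration of the scoring idea: the classic SUS, from which the SCS is adapted, aggregates ten five-point Likert items into a 0–100 score. The sketch below implements only that well-known SUS formula; the SCS's own items and scoring are defined in the paper itself.

```python
def sus_score(ratings):
    """Score a ten-item System Usability Scale questionnaire.

    ratings: ten Likert responses (1..5), alternating positively and
    negatively worded items, as in the classic SUS.
    Returns a score in the range 0..100.
    """
    if len(ratings) != 10 or not all(1 <= r <= 5 for r in ratings):
        raise ValueError("expected ten ratings between 1 and 5")
    # Odd-numbered items (index 0, 2, ...) are positively worded:
    # they contribute (rating - 1).  Even-numbered items are
    # negatively worded: they contribute (5 - rating).
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(ratings))
    return total * 2.5  # scale the 0..40 raw points to 0..100

# A respondent who fully agrees with every positive item and fully
# disagrees with every negative item reaches the maximum score:
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```

A neutral respondent (all items rated 3) lands exactly in the middle of the scale, at 50.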

Causability and Explainability of Artificial Intelligence in Medicine

Andreas Holzinger, Georg Langs, Helmut Denk, Kurt Zatloukal & Heimo Mueller 2019. Causability and explainability of artificial intelligence in medicine. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, doi:10.1002/widm.1312.

In this paper we introduce the notion of Causability, which extends explainability and is of great importance for future human–AI interfaces (see our paper on dialogue systems for intelligent user interfaces). Such interfaces for explainable AI have to map technical explainability (a property of the AI, e.g. the heatmap of a neural network produced by layer-wise relevance propagation) to causability (a property of the human, i.e. the extent to which the technical explanation is interpretable by the end user), and have to answer the question of why we need a ground truth, i.e. a framework for understanding.

KANDINSKY Patterns: A Swiss-Knife for the Study of Explainable AI

Andreas Holzinger, Peter Kieseberg & Heimo Müller 2020. KANDINSKY Patterns: A Swiss-Knife for the Study of Explainable AI. ERCIM News, (120), 41-42. [pdf, 755 KB]

Kandinsky Patterns enable testing, benchmarking, and evaluating machine learning algorithms under mathematically strictly controllable conditions, while at the same time being accessible and understandable for human observers, with the possibility to produce (and hide) a ground truth. This will be extremely important in the future, as adversarial examples have already demonstrated their potential for attacking security mechanisms in various domains, especially medical environments. Last but not least, Kandinsky Patterns can be used to produce "counterfactuals" – the "what if", which is difficult to handle for both humans and machines – but which can provide new insights into the behaviour of explanation methods.

A new concordant partial AUC and partial c statistic for imbalanced data in the evaluation of machine learning algorithms

André M. Carrington, Paul W. Fieguth, Hammad Qazi, Andreas Holzinger, Helen H. Chen, Franz Mayr & Douglas G. Manuel 2020. A new concordant partial AUC and partial c statistic for imbalanced data in the evaluation of machine learning algorithms. Springer/Nature BMC Medical Informatics and Decision Making, 20, (1), 4, doi:10.1186/s12911-019-1014-6.

In explainable AI, a very important issue is the robustness of machine learning algorithms. We introduce a novel concordant partial Area Under the Curve (AUC) and a new partial c statistic for the Receiver Operating Characteristic (ROC) curve as foundational measures to help understand and explain ROC and AUC. Our partial measures are continuous and discrete versions of the same measure, are derived from the AUC and the c statistic respectively, are validated as equal to each other, and are validated as summing to the whole measures where expected.
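For context, the whole-curve c statistic that these partial measures decompose can be computed directly from pairwise concordance. The following is a minimal pure-Python sketch of that standard whole-curve computation; the concordant partial AUC itself restricts such comparisons to a region of the ROC curve and is defined precisely in the paper.

```python
def c_statistic(scores, labels):
    """Concordance (c) statistic: the probability that a randomly
    chosen positive instance receives a higher score than a randomly
    chosen negative one, with ties counting one half.  For a binary
    classifier this equals the whole-curve AUC."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    concordant = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))

# A perfectly separating scorer yields c = 1.0:
print(c_statistic([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # 1.0
```

Because it is an average over positive/negative pairs, the measure is insensitive to class imbalance in a way plain accuracy is not, which is part of why ROC-based measures matter for imbalanced data.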

KANDINSKY Patterns as Intelligence Test for machines

Andreas Holzinger, Michael Kickmeier-Rust & Heimo Mueller 2019. KANDINSKY Patterns as IQ-Test for machine learning. Springer Lecture Notes LNCS 11713. Cham (CH): Springer Nature Switzerland, pp. 1-14, doi:10.1007/978-3-030-29726-8_1 .

AI follows the notion of human intelligence, which is not a clearly defined term; according to cognitive science it includes the abilities to think abstractly, to reason, and to solve problems in the real world. A hot topic in current AI/machine learning research is to find out whether and to what extent algorithms are able to learn abstract thinking and reasoning similarly to humans, or whether their learning remains limited to purely statistical correlations. In this paper we propose to use our Kandinsky Patterns as an IQ-Test for machines and to study concept learning, which is a fundamental problem for future AI/ML. [Paper] [Conference Slides] [exploration environment] [TEDx]


Dialogue Systems for Intelligent Human Computer Interactions

Erinc Merdivan, Deepika Singh, Sten Hanke & Andreas Holzinger 2019. Dialogue Systems for Intelligent Human Computer Interactions. Electronic Notes in Theoretical Computer Science, 343, 57-71, doi:10.1016/j.entcs.2019.04.010.


In this paper we present some fundamentals of communication techniques for interaction in dialogues involving speech, gesture, and semantic and pragmatic knowledge, and present a new image-based method in an out-of-vocabulary setting. The results show that treating the dialogue as an image performs well and helps the dialogue manager handle out-of-vocabulary dialogue tasks better than Memory Networks. This is important for future human–AI interfaces.

The first publication on our KANDINSKY Universe

Heimo Müller & Andreas Holzinger 2019. Kandinsky Patterns. arXiv:1906.00657

KANDINSKY Figures and KANDINSKY Patterns are mathematically describable, simple, self-contained, and hence controllable test data sets for the development, validation, and training of explainability/interpretability in artificial intelligence (AI) and machine learning (ML). While they possess these computationally manageable properties, they are at the same time easily distinguishable by human observers, and can thus be described by both humans and algorithms. We invite the international machine learning research community to a challenge: experiment with our Kandinsky Patterns to expand and advance the field of explainable AI and to contribute to the upcoming field of explainability and causability. [Project Page]
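To make the idea tangible, here is a minimal sketch of a Kandinsky-style figure generator with a hypothetical ground-truth statement. The vocabulary of shapes and colours and the example statement "all objects have the same colour" are illustrative choices; the actual figures and patterns are defined formally in the paper.

```python
import random

SHAPES = ["circle", "square", "triangle"]
COLOURS = ["red", "blue", "yellow"]

def random_figure(n_objects, rng=random):
    """A Kandinsky-style figure: a set of objects, each with a shape,
    a colour and a position on the unit square."""
    return [{"shape": rng.choice(SHAPES),
             "colour": rng.choice(COLOURS),
             "x": rng.random(), "y": rng.random()}
            for _ in range(n_objects)]

def belongs_to_pattern(figure):
    """Hypothetical ground-truth statement: 'all objects have the
    same colour'.  A Kandinsky Pattern is then the set of all figures
    for which such a statement holds true."""
    return len({obj["colour"] for obj in figure}) == 1
```

Because the generating statement is known, one can produce arbitrarily many positive and negative examples with a fully controlled (and, if desired, hidden) ground truth.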

Interactive machine learning: experimental evidence for the human in the algorithmic loop: A case study on Ant Colony Optimization

Andreas Holzinger, Markus Plass, Michael Kickmeier-Rust, Katharina Holzinger, Gloria Cerasela Crişan, Camelia-M. Pintea & Vasile Palade 2019. Interactive machine learning: experimental evidence for the human in the algorithmic loop. Applied Intelligence, 49, (7), 2401-2414, doi:10.1007/s10489-018-1361-5.

In this paper we provide novel experimental insights into how we can improve computational intelligence by complementing it with human intelligence in an interactive machine learning (iML) approach. For this purpose we used the Ant Colony Optimization (ACO) framework, because it fosters multi-agent approaches with human agents in the loop. We propose a unification of human intelligence and interaction skills with the computational power of an artificial system.
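The idea can be sketched in a few lines: a basic ACO loop over a travelling-salesman distance matrix, with a hook through which a human could adjust the pheromone matrix between iterations. Note this is only an illustration of the principle; the experiment in the paper used its own interface and protocol, and the `human_feedback` hook here is a hypothetical stand-in.

```python
import random

def tour_length(tour, dist):
    """Total length of a closed tour over a distance matrix."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

def ant_colony(dist, n_ants=20, n_iters=50, evaporation=0.5,
               human_feedback=None, rng=random):
    """Minimal Ant Colony Optimization for a TSP distance matrix.
    If given, `human_feedback(pheromones)` is called once per
    iteration and may modify the pheromone matrix in place: the
    'human in the algorithmic loop'."""
    n = len(dist)
    pher = [[1.0] * n for _ in range(n)]
    best, best_len = None, float("inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant builds a tour edge by edge, preferring edges
            # with high pheromone and short distance.
            tour = [rng.randrange(n)]
            while len(tour) < n:
                i = tour[-1]
                cand = [j for j in range(n) if j not in tour]
                weights = [pher[i][j] / (dist[i][j] + 1e-9) for j in cand]
                tour.append(rng.choices(cand, weights=weights)[0])
            length = tour_length(tour, dist)
            if length < best_len:
                best, best_len = tour, length
        # Evaporate everywhere, then deposit along the best tour.
        for i in range(n):
            for j in range(n):
                pher[i][j] *= (1 - evaporation)
        for i in range(n):
            a, b = best[i], best[(i + 1) % n]
            pher[a][b] += 1.0 / best_len
            pher[b][a] += 1.0 / best_len
        if human_feedback is not None:
            human_feedback(pher)  # the human agent intervenes here
    return best, best_len
```

A human who spots that the colony is stuck can, for example, raise the pheromone on an edge she considers promising, which biases subsequent ants without overriding the stochastic search.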

Visual analytics for concept exploration in subspaces of patient groups: Making sense of complex datasets with the Doctor-in-the-loop

Michael Hund, Dominic Boehm, Werner Sturm, Michael Sedlmair, Tobias Schreck, Torsten Ullrich, Daniel A. Keim, Ljiljana Majnaric & Andreas Holzinger 2016. Visual analytics for concept exploration in subspaces of patient groups: Making sense of complex datasets with the Doctor-in-the-loop. Brain Informatics, 3, (4), 233-247, doi:10.1007/s40708-016-0043-5.

In this paper, which provides further evidence for the human-in-the-loop concept, we present SubVIS, an interactive tool for visually exploring subspace clusters from different perspectives, introduce a novel analysis workflow, and discuss future directions for high-dimensional (medical) data analysis and its visual exploration.


Interactive Machine Learning (iML) for health informatics: When do we need the human-in-the-loop?

Andreas Holzinger 2016. Interactive Machine Learning for Health Informatics: When do we need the human-in-the-loop? Brain Informatics, 3, (2), 119-131, doi:10.1007/s40708-016-0042-6.

In this highly cited paper we define iML as "algorithms that can interact with agents and can optimize their learning behaviour through these interactions, where the agents can also be human." This "human-in-the-loop" (or a crowd of humans) can be beneficial in solving computationally hard problems, e.g., subspace clustering, protein folding, or k-anonymization, where human expertise can help to reduce an exponential search space through heuristic selection of samples in a glass-box approach. Ultimately, this fosters explainability and verifiability, because a human is able to re-trace and thus understand the underlying factors of why a certain decision has been made, and at the same time is able to re-enact and verify the results.

Biomedical image augmentation using Augmentor

Marcus D. Bloice, Peter M. Roth & Andreas Holzinger 2019. Biomedical image augmentation using Augmentor. Bioinformatics (Oxford Academic Press), 35, (1), 4522-4524, doi:10.1093/bioinformatics/btz259.

In this paper we present the Augmentor software package for image augmentation. It provides a stochastic, pipeline-based approach to image augmentation with a number of features relevant to biomedical imaging, such as z-stack augmentation and randomized elastic distortions. The software has been designed to be highly extensible, meaning that an operation specific to a highly specialized task can easily be added to the library, even at runtime. Two versions are available, one in Python and one in Julia.
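The stochastic pipeline idea can be illustrated in a few lines of plain Python. This is a conceptual sketch, not Augmentor's actual API: each operation fires independently with its own probability, so every call draws a sample from a distribution of augmented images.

```python
import random

class Pipeline:
    """Minimal sketch of a stochastic augmentation pipeline: a list
    of (probability, operation) pairs applied in order, each firing
    independently with its own probability."""
    def __init__(self, rng=random):
        self.ops = []   # list of (probability, function) pairs
        self.rng = rng

    def add(self, probability, fn):
        self.ops.append((probability, fn))

    def sample(self, image):
        for p, fn in self.ops:
            if self.rng.random() < p:
                image = fn(image)
        return image

# Example operations on an image stored as a list of pixel rows.
def flip_lr(img):
    """Mirror each row (horizontal flip)."""
    return [row[::-1] for row in img]

def flip_tb(img):
    """Reverse the row order (vertical flip)."""
    return img[::-1]

p = Pipeline()
p.add(0.5, flip_lr)   # horizontal flip half of the time
p.add(0.2, flip_tb)   # vertical flip a fifth of the time
augmented = p.sample([[1, 2], [3, 4]])
```

Sampling at runtime, rather than materializing an augmented dataset on disk, is what lets the training loop see a fresh random variant of each image in every epoch.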

Why imaging data alone is not enough: AI-based integration of imaging, omics, and clinical data

Andreas Holzinger, Benjamin Haibe-Kains & Igor Jurisica 2019. Why imaging data alone is not enough: AI-based integration of imaging, omics, and clinical data. European Journal of Nuclear Medicine and Molecular Imaging, 46, (13), 2722-2730, doi:10.1007/s00259-019-04382-9.

The integration of clinical, imaging, and molecular data is necessary to understand complex diseases and to achieve accurate diagnosis so as to provide the best possible treatment. In addition to the need for sufficient computing resources, suitable algorithms, models, and data infrastructure, three important aspects are often neglected: (1) the need for multiple independent, sufficiently large and, above all, high-quality data sets; (2) the need for domain knowledge and ontologies; and (3) the requirement for multiple networks that provide relevant relationships among biological entities. While one will always get results out of high-dimensional data, all three aspects are essential to provide robust training and validation of ML models, to provide explainable hypotheses and results, and to achieve the necessary trust in AI and confidence for clinical applications.

Human Activity Recognition Using Recurrent Neural Networks

Deepika Singh, Erinc Merdivan, Ismini Psychoula, Johannes Kropf, Sten Hanke, Matthieu Geist & Andreas Holzinger 2017. Human Activity Recognition Using Recurrent Neural Networks. In: Lecture Notes in Computer Science LNCS 10410. Cham: Springer International, pp. 267-274, doi:10.1007/978-3-319-66808-6_18.

Human activity recognition using smart home sensors is one of the bases of ubiquitous computing in smart environments and a topic undergoing intense research in the field of ambient assisted living; the increasingly large data sets call for machine learning methods. In this paper we introduce a deep learning model that learns to classify human activities without using any prior knowledge. For this purpose, a Long Short-Term Memory (LSTM) recurrent neural network was applied to three real-world smart home datasets. The results of our experiments show that the proposed approach outperforms existing methods in terms of accuracy and performance.
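The core mechanism of the LSTM can be written out in plain Python for a single scalar unit. This is a didactic sketch with toy weights, not the network, weights, or data used in the paper; it only shows how the gates update the cell state as a sensor sequence is consumed.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One forward step of a single-unit LSTM cell (scalar input and
    state, for readability).  w maps each gate name to a weight
    triple (w_x, w_h, b): input 'i', forget 'f', output 'o' and
    candidate 'g'."""
    def gate(k, f):
        w_x, w_h, b = w[k]
        return f(w_x * x + w_h * h_prev + b)
    i = gate("i", sigmoid)    # how much new information to admit
    f = gate("f", sigmoid)    # how much old cell state to keep
    o = gate("o", sigmoid)    # how much of the state to expose
    g = gate("g", math.tanh)  # candidate cell-state update
    c = f * c_prev + i * g    # new cell state
    h = o * math.tanh(c)      # new hidden state / output
    return h, c

# Feed a short binary motion-sensor trace through the cell with toy
# weights; the cell state accumulates evidence across time steps.
w = {k: (1.0, 0.5, 0.0) for k in ("i", "f", "o", "g")}
h = c = 0.0
for x in [0.0, 1.0, 1.0, 0.0]:
    h, c = lstm_step(x, h, c, w)
```

The forget gate is what lets the network keep or discard context over long sensor sequences, which is why LSTMs suit activity recognition better than plain recurrent units.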

Augmenting Statistical Data Dissemination by Short Quantified Sentences of Natural Language

Miroslav Hudec, Erika Bednárová & Andreas Holzinger 2018. Augmenting Statistical Data Dissemination by Short Quantified Sentences of Natural Language. Journal of Official Statistics (JOS), 34, (4), 981, doi:10.2478/jos-2018-0048.

In this paper we study the potential of natural language summaries expressed in short quantified sentences. Linguistic summaries are not intended to replace existing dissemination approaches, but can augment them by providing alternatives for the benefit of diverse users (e.g. domain experts, the general public, disabled people, …). The concept of linguistic summaries is demonstrated on test interfaces, which can be important for future human–AI dialogue systems.
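A short quantified sentence such as "most of the records are young" can be assigned a truth degree in the classic Yager style: a fuzzy quantifier applied to the mean membership of the records in the summarized property. The membership function and quantifier below are illustrative choices for the sketch, not the definitions used in the paper.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal fuzzy membership function with support [a, d]
    and core [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def quantifier_most(r):
    """Fuzzy quantifier 'most' over a proportion r in [0, 1]
    (an illustrative piecewise-linear choice)."""
    return min(1.0, max(0.0, (r - 0.3) / 0.5))

def summary_truth(values, membership, quantifier):
    """Truth degree of 'Q of the records are P' (Yager-style):
    the quantifier applied to the mean membership in P."""
    r = sum(membership(v) for v in values) / len(values)
    return quantifier(r)

ages = [23, 25, 31, 64, 28, 30, 22]
young = lambda a: trapezoid(a, 0, 0, 30, 40)  # hypothetical 'young'
truth = summary_truth(ages, young, quantifier_most)
```

The resulting truth degree is a single number in [0, 1] that can be reported alongside the sentence, so a reader knows how well the summary actually fits the data.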

Computational approaches for mining user’s opinions on the Web 2.0

Gerald Petz, Michał Karpowicz, Harald Fuerschuss, Andreas Auinger, Vaclav Stritesky & Andreas Holzinger 2014. Computational approaches for mining user's opinions on the Web 2.0. Information Processing & Management, 50, (6), 899-908, doi:10.1016/j.ipm.2014.07.005.

Computational opinion mining discovers, extracts, and analyzes people's opinions, attitudes, and emotions towards certain topics in social media. While providing interesting market research information, user-generated content presents numerous challenges for systematic analysis, owing to the differences and unique characteristics of the various social media channels. Here we report on the determination of such particularities and deduce their impact on text preprocessing and opinion mining algorithms.

Explainable AI: The New 42?

Randy Goebel, Ajay Chander, Katharina Holzinger, Freddy Lecue, Zeynep Akata, Simone Stumpf, Peter Kieseberg & Andreas Holzinger 2018. Explainable AI: the new 42? Springer Lecture Notes in Computer Science LNCS 11015. Cham: Springer, pp. 295-303, doi:10.1007/978-3-319-99740-7_21.

In this 2018 output of our yearly xAI workshop at the CD-MAKE conference we discuss some issues of the current state of the art in what is now called explainable AI, and outline what we think is the next big thing in AI/machine learning: the combination of statistical probabilistic machine learning methods with classic logic-based symbolic artificial intelligence. Perhaps the field of explainable AI can act as an ideal bridge to combine these two worlds. [pdf, 875 kB]

Biomedical image augmentation using Augmentor

Marcus D. Bloice, Peter M. Roth & Andreas Holzinger 2019. Biomedical image augmentation using Augmentor. Oxford Bioinformatics, 35, (1), 4522-4524, doi:10.1093/bioinformatics/btz259.

Within our Augmentor project, aiming to improve model accuracy and generalisation and to control overfitting, we developed Augmentor, a software package, available in both Python and Julia versions, that provides a high-level API for the expansion of image data using a stochastic, pipeline-based approach, which effectively allows images to be sampled from a distribution of augmented images at runtime. Augmentor provides methods for most standard augmentation practices as well as several advanced features, such as label-preserving randomised elastic distortions, and provides many helper functions for typical augmentation tasks used in machine learning.

Interpretierbare KI: Neue Methoden zeigen Entscheidungswege künstlicher Intelligenz auf

Andreas Holzinger 2018. Interpretierbare KI: Neue Methoden zeigen Entscheidungswege künstlicher Intelligenz auf. c’t Magazin für Computertechnik, 22, 136-141.

Machine learning today produces AI systems that make decisions faster than a human can. But should humans allow themselves to be sidelined? New methods make decision paths transparent and retraceable, thereby creating trust and acceptance, or uncovering misunderstandings. [pdf, 871 kB]


Explainable AI

Andreas Holzinger 2018. Explainable AI (ex-AI). Informatik-Spektrum, 41, (2), 138-143, doi:10.1007/s00287-018-1102-5.

"Explainable AI" is not a new field. The problem of explainability is as old as AI itself, indeed it is a result of AI itself. While the rule-based solutions of early AI were comprehensible "glass-box" approaches, their weakness lay in dealing with the uncertainties of the real world. With the introduction of probabilistic modelling and statistical learning methods, applications became increasingly successful, but also increasingly complex and opaque. For example, words of natural language are mapped to high-dimensional vectors, which makes them incomprehensible to humans. In the future, context-adaptive methods will become necessary that link statistical learning methods with large knowledge representations (ontologies) and enable traceability, comprehensibility, and explainability: the goal of "explainable AI".