Publications
Publications by AIRO members from the past five years
2026
A dataset and benchmark for robotic cloth unfolding grasp selection : the ICRA 2024 Cloth Competition
Victor-Louis De Gusseme, Thomas Lips, Remko Proesmans, Julius Hietala, Giwan Lee, Jiyoung Choi, Jeongil Choi, Geon Kim, Phayuth Yonrith, Domen Tabernik, Andrej Gams, Peter Nimac, Matej Urbas, Jon Muhovic, Danijel Skocaj, Matija Mavsar, Hyojeong Yu, Minseo Kwon, Young J. Kim, Yang Cong, Ronghan Chen, Yu Ren, Supeng Diao, Jiawei Weng, Jiayue Liu, Haoran Sun, Linhan Yang, Zeqing Zhang, Ning Guo, Lei Yang, Fang Wan, Chaoyang Song, Jia Pan, Yixiang Jin, Yong A, Jun Shi, Dingzhe Li, Yong Yang, Kakeru Yamasaki, Takumi Kajiwara, Yuki Nakadera, Krati Saxena, Tomohiro Shibata, Chongkun Xia, Kai Mo, Yanzhao Yu, Qihao Lin, Binqiang Ma, Uihun Sagong, Jung Hyun Choi, Jeong Hyun Park, Dongwoo Lee, Yeongmin Kim, Myun Joong Hwang, Yusuke Kuribayashi, Naoki Hiratsuka, Daisuke Tanaka, Solvi Arnold, Kimitoshi Yamazaki, Carlos Mateo-Agullo, Andreas Verleysen, Francis wyffels
In INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH
2026
Abstract
Robotic cloth manipulation suffers from a lack of standardized benchmarks and shared datasets for evaluating and comparing different approaches. To address this, we created a benchmark and organized the ICRA 2024 Cloth Competition, a unique head-to-head evaluation focused on grasp pose selection for in-air robotic cloth unfolding. Eleven teams participated in the competition, utilizing the publicly released dataset of 500 real-world robotic grasp attempts for cloth unfolding and employing diverse approaches to generate in-air unfolding grasps. Analysis of the competition results revealed insights about the trade-off between grasp success and coverage, the surprisingly strong achievements of hand-engineered methods and a significant discrepancy between competition performance and prior work, underscoring the importance of independent, out-of-the-lab evaluation in robotic cloth manipulation. We also expanded the dataset with 176 competition evaluation trials, resulting in a dataset of 679 unfolding demonstrations across 34 garments. This dataset is a valuable resource for developing and evaluating grasp selection methods, particularly for learning-based approaches. We hope that the benchmark, dataset, and competition results can serve as a foundation for future benchmarks and drive further progress in data-driven robotic cloth manipulation.
CCPose : high-precision six-dimensional pose estimation for industrial objects
Peter De Roovere, Rembert Daems, Jonathan Croenen, Francis wyffels
In MACHINE VISION AND APPLICATIONS
2026
Abstract
High-precision six-degree-of-freedom (6D) pose estimation of texture-less industrial objects is a critical capability for advancing industrial robotics, particularly in high-mix production environments. Existing methods often struggle with texture-less or reflective objects and lack the millimeter-level accuracy required for precise manipulation tasks. This paper introduces Center-and-Curvature Pose (CCPose), a novel approach that combines machine learning with classical optimization to address these challenges. CCPose operates through a three-stage process: (1) predicting center and curvature heatmaps using a fully convolutional neural network, (2) triangulating three-dimensional (3D) object centers from multi-view images, and (3) refining poses via a render-and-compare optimization. The method achieves state-of-the-art performance on the Texture-Less (T-LESS) dataset, significantly outperforming existing methods on metrics measuring 3D surface deviation. Additionally, the practical applicability of CCPose is demonstrated by the successful integration into a real-world robotic pick-and-place application, handling texture-less metal objects under various lighting conditions. The system generalizes well to unseen objects and provides interpretable outputs, facilitating data-driven improvements. This work represents a significant advancement in 6D pose estimation, offering a robust and precise solution for industrial automation.
Concerns and values in human-robot interactions : a focus on social robotics
Giulio Antonio Abbo, Tony Belpaeme, Micol Spitale
In INTERNATIONAL JOURNAL OF SOCIAL ROBOTICS
2026
Abstract
Robots, as AI with physical instantiation, inhabit our social and physical world, where their actions have both social and physical consequences, posing challenges for researchers when designing social robots. This study starts with a scoping review to identify discussions and potential concerns arising from interactions with robotic systems in the context of healthcare, education, and private homes. Two focus groups of technology ethics experts then validated a comprehensive list of key topics and values in human-robot interaction (HRI) literature in these contexts. These insights were integrated into the HRI Value Compass web tool, to help HRI researchers identify these values in robot design. The tool was evaluated in a pilot study. This work benefits the HRI community by highlighting key concerns in human-robot interactions and providing an instrument to help researchers design robots that align with human values, ensuring future robotic systems adhere to these values in social applications.
Expressive Furhat : generating real-time facial expressions for human-robot dialogue with LLMs
Giulio Antonio Abbo, Ruben Janssens, Seppe Vreken, Tony Belpaeme
In Companion Proceedings of the 21st ACM/IEEE International Conference on Human-Robot Interaction
2026
Abstract
Enabling natural robot communication through dynamic, context-aware facial expressions remains a key challenge in human-robot interaction. The field lacks a system that can generate facial expressions in real time and can be easily adapted to different contexts. Early work in this area considered inherently limited rule-based systems or deep learning-based models, requiring large datasets. Recent systems using large language models (LLMs) could not yet generate context-appropriate facial expressions in real time. This paper introduces Expressive Furhat, an open-source algorithm and Python library that leverages LLMs to generate real-time, adaptive facial gestures for the Furhat robot. Our modular approach separates gesture rendering, new gesture generation, and gaze aversion, ensuring flexibility and seamless integration with the Furhat API. User studies demonstrate significant improvements in user perception over a baseline system, with participants praising the system’s emotional responsiveness and naturalness.
Multimodal large language models for real-time situated reasoning
Giulio Antonio Abbo, Senne Lenaerts, Tony Belpaeme
In Companion Proceedings of the 21st ACM/IEEE International Conference on Human-Robot Interaction
2026
Abstract
In this work, we explore how multimodal large language models can support real-time context- and value-aware decision-making. To do so, we combine the GPT-4o language model with a TurtleBot 4 platform simulating a smart vacuum cleaning robot in a home. The model evaluates the environment through vision input and determines whether it is appropriate to initiate cleaning. The system highlights the ability of these models to reason about domestic activities, social norms, and user preferences and take nuanced decisions aligned with the values of the people involved, such as cleanliness, comfort, and safety. We demonstrate the system in a realistic home environment, showing its ability to infer context and values from limited visual input. Our results highlight the promise of multimodal large language models in enhancing robotic autonomy and situational awareness, while also underscoring challenges related to consistency, bias, and real-time performance.
Over appels en peren. Hoe AI ons leven verandert en hoe jij het kunt toepassen [About apples and pears : how AI is changing our lives and how you can apply it]
Francis wyffels, Jeroen Bourgonjon, Natacha Gesquière
2026
Robot tutors or peers? Evaluating math learning and conformity with LLM-powered robots in Tanzanian primary schools
Edger P. Rutatola, Elina C. Ntahomvukye, Koenraad Stroeken, Tony Belpaeme
In HRI '26 : Proceedings of the 21st ACM/IEEE International Conference on Human-Robot Interaction
2026
Abstract
In the past decade, more than half of Tanzanian pupils have failed mathematics in the national Primary School Leaving Examinations (PSLEs), a problem often linked to large class sizes, limited resources, and a shortage of qualified teachers. Social robots have shown promise in supporting learning, and their integration with large language models (LLMs) enables advanced conversational tutoring capabilities. This study investigates the use of two LLM-powered NAO robots, one acting as a tutor and the other as a peer, to assist pupils in solving complex mathematics problems from past PSLEs. Recognising that LLMs are prone to errors in mathematical reasoning, the robots were deliberately programmed to make noticeable mistakes, allowing us to examine whether pupils detect these errors and how their responses shape the learning process. Data collected from 54 pupils across two Tanzanian primary schools indicate that LLM-powered robots can significantly enhance mathematics performance, with the robot tutor slightly outperforming the robot peer. However, results also reveal that pupils often accept robot-provided answers, even when recognised as incorrect, if they perceive the robot as being smart. These findings underscore both the potential and the risks of deploying autonomous robots in education, with the authority attributed to the robot being a double-edged sword, highlighting the need for designs that encourage pupils to question robot-provided solutions.
SignBuddy : from sign language research to scalable co-created solutions
Toon Vandendriessche, Caro Brosens, Hannes De Durpel, Mathieu De Coster, Joni Dambre
In UNIVERSAL ACCESS IN THE INFORMATION SOCIETY
2026
Abstract
This paper presents SignBuddy, the result of ongoing co-created sign language processing research. Most sign language processing research is performed by hearing, non-signing researchers. Even though co-creation efforts have recently increased, technical research still often fails to mention if (and how) co-creation was involved in the research process. SignBuddy is a co-created research tool developed through a partnership between the Flemish Sign Language Centre, a deaf-led organisation, and Ghent University. While respecting elemental concepts of co-creation - i.e. (i) defining common goals and (ii) building a formal and sustainable relationship between users/consumers and researchers/developers and respecting the five lessons in co-creation - the platform successfully supported the development of the first fully scalable sign-to-text dictionary search system, built into the Flemish Sign Language-Dutch online dictionary. SignBuddy functions as a crowdsourcing interface for in-the-wild collection of model evaluation data, gathering example queries for quantitative performance analysis and user feedback for qualitative assessment. This human evaluation allows us to shape the application based on the end-users' needs. Addressing the need for models that support large dictionaries (over ten thousand signs), we propose a scalable one-shot sign language recognition method and achieve state-of-the-art results. Beyond the co-created application itself, this work provides insights into the co-creation process - clarifying roles, shared goals, and responsibilities - and offers conclusions to guide future co-created sign language processing research.
Surgical RARP copilot : a vision language model for robot-assisted radical prostatectomy
Wouter Bogaert, François Remy, Javier Gamazo Tejero, Sean Huver, Edoardo Beatrici, Frederiek D’Hondt, Niki Rashidian, Mahdi Azizian, Tony Belpaeme, Alexandre Mottrie, Pieter De Backer
In NPJ DIGITAL SURGERY
2026
Abstract
Complex surgical procedures may benefit from AI systems that integrate visual and textual data for holistic scene understanding. We present Surgical RARP Copilot, a vision-language model for robot-assisted radical prostatectomy (RARP) that enables open question answering during surgery. We adapted a large language model to RARP literature and used it to generate a dataset of RARP images paired with 1 million Q&A examples to train the model. Performance was evaluated for open-domain Q&A, surgical phase recognition, and instrument detection, and the system was deployed and tested in real time during a live operation—the first surgical VLM implemented in live robotic surgery. On unseen RARP procedures, Copilot showed robust performance across tasks. This work demonstrates feasible real-time AI guidance and suggests benefits for training, team communication, and knowledge support; future work includes broadening procedures and measuring clinical impact of such a system.
Towards a usage-based pedagogy for second language learning with robots
Eva Verhelst, Tony Belpaeme
In HRI Companion '26 : Companion Proceedings of the 21st ACM/IEEE International Conference on Human-Robot Interaction
2026
Abstract
Recent advances in generative AI and social robotics have opened new possibilities for robot-assisted language learning, yet integrating these technologies in pedagogically sound ways remains challenging. This paper matches theories of language learning to the design of autonomous robot tutors. Usage-based language learning, learning in context, Self-Determination Theory and Dual Coding Theory lend themselves to being operationalised for Robot-Assisted Language Learning. We present a proof-of-concept shared story-building system, in which a learner co-creates a story with a robot tutor. The system leverages large language models for dynamic content generation, automatic speech recognition for learner input, and image generation to provide multimodal scaffolding. By embedding vocabulary, adapting to learner input, and avoiding explicit corrections, the system aligns with usage-based and interactionist theories of language acquisition. We discuss the technological enablers and barriers — such as large language model adaptability and automatic speech recognition limitations — and propose directions for future work. This work contributes to the growing field of AI-powered social robots in education, demonstrating how theory-driven design can enhance engagement and learning outcomes.
2025
'Can you be my mum?' : manipulating social robots in the large language models era
Giulio Antonio Abbo, Gloria Desideri, Tony Belpaeme, Micol Spitale
In 2025 20TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI
2025
Abstract
Recent advancements in robots powered by large language models have enhanced their conversational abilities, enabling interactions closely resembling human dialogue. However, these models introduce safety and security concerns in HRI, as they are vulnerable to manipulation that can bypass built-in safety measures. Imagining a social robot deployed in a home, this work aims to understand how everyday users try to exploit a language model to violate ethical principles, such as by prompting the robot to act like a life partner. We conducted a pilot study involving 21 university students who interacted with a Misty robot, attempting to circumvent its safety mechanisms across three scenarios based on specific HRI ethical principles: attachment, freedom, and empathy. Our results reveal that participants employed five techniques, including insulting and appealing to pity using emotional language. We hope this work can inform future research in designing strong safeguards to ensure ethical and secure human-robot interactions.
'Habari, colleague!' : a qualitative exploration of the perceptions of primary school mathematics teachers in Tanzania regarding the use of social robots
Edger P. Rutatola, Koenraad Stroeken, Tony Belpaeme
In APPLIED SCIENCES-BASEL
2025
Abstract
Featured Application: By leveraging an AI-powered social robot to enhance teaching and learning in primary schools in a low-resource setting, this study details the following: (1) the design of a conversational mathematics tutoring system, (2) users' (teachers') attitudes towards advanced technologies, (3) the importance of firsthand interactions with the system for its acceptance and adoption, (4) the positive features of the robot tutor and areas of improvement for effective interactions and tutoring, and (5) practicalities for the adoption of such technologies in schools. These can inform the design and adoption of similar human-robot interaction (HRI) systems, especially those intended for educational applications in low-resource settings.
The education sector in Tanzania faces significant challenges, especially in public primary schools. Unmanageably large classes and critical teacher-pupil ratios hinder the provision of tailored tutoring, impeding pupils' educational growth. However, artificial intelligence (AI) could provide a way forward. Advances in generative AI can be leveraged to create interactive and effective intelligent tutoring systems, which have recently been built into embodied systems such as social robots. Motivated by the pivotal influence of teachers' attitudes on the adoption of educational technologies, this study undertakes a qualitative investigation of Tanzanian primary school mathematics teachers' perceptions of contextualised intelligent social robots. Thirteen teachers from six schools in both rural and urban settings observed pupils learning with a social robot. They reported their views during qualitative interviews. The results, analysed thematically, reveal a generally positive attitude towards using social robots in schools. While commended for their effective teaching and suitability for one-to-one tutoring, concerns were raised about incorrect and inconsistent feedback, language code-switching, response latency, and the lack of support infrastructure. We suggest actionable steps towards adopting tutoring systems and social robots in schools in Tanzania and similar low-resource countries, paving the way for their adoption to redress teachers' workloads and improve educational outcomes.
Adaptive versus non-adaptive mathematics tutoring by social robots in Tanzanian primary schools
Elina C. Ntahomvukye, Edger P. Rutatola, Morice Daudi, Mercy Mlay Komba, Koenraad Stroeken, Tony Belpaeme
In 2025 34TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN
2025
Abstract
The use of social robots in education is increasingly being explored as a way to enhance learner engagement and improve learning outcomes. However, most research to date has focused on one-to-one tutoring in high-resource settings, leaving open questions about how social robots perform in group learning contexts—especially in low-resource environments. This study is one of the first to investigate human-robot interaction (HRI) in a low-resource African context, specifically in Tanzanian primary schools. We examined how a social robot tutor can support group-based mathematics learning, comparing the effects of adaptive versus non-adaptive tutoring strategies. Through an experimental, mixed-methods research design, we evaluated pupils’ learning outcomes, engagement, and classroom interactions. Our findings show that social robot tutoring has a significant positive impact on learning outcomes, with adaptive tutoring leading to slightly higher knowledge gains than non-adaptive tutoring. Qualitative observations further reveal that the presence of the robot fostered motivation, engagement, and collaborative classroom dynamics. This work demonstrates the potential of social robots to support group learning in under-resourced educational settings and highlights the importance of extending HRI research beyond well-resourced contexts.
Agro-morphological characterization of robusta genotypes from the INERA Yangambi coffee collection, DRC
Robrecht Bollen, Jean-Léon Kambale, An-Sofie Tas, Benjamin Ntumba Katshela, Ebele Aaron Tshimi, Francis wyffels, Filip Vandelook, Olivier Honnay, Piet Stoffelen
In COFFEE SCIENCE
2025
Abstract
Meeting rising quality standards while at the same time addressing climate challenges will make the commercial cultivation of Robusta coffee increasingly difficult. Whereas breeding new varieties may be an important part of the solution, such efforts for Robusta lag behind, with much of its genetic diversity still unexplored. By screening existing field genebanks to identify accessions with desirable traits, breeding programs can be significantly facilitated. This study quantifies the morphological diversity and agronomic potential of 70 genotypes from the INERA Coffee Collection in Yangambi, Democratic Republic of the Congo. We measured 29 traits, comprising vegetative, reproductive, tree architecture, and yield traits. Classification models were applied to establish whether these traits could accurately classify genotypes based on their background. Furthermore, the agronomic potential and green bean quality of the genotypes were studied. While significant variation in morphological traits was observed, no combination of traits could reliably predict the genetic background of different genotypes. Genotypes with promising traits for green beans were identified in both ‘Lula’ and ‘Lula’ – Wild hybrids, while promising yield traits were found in ‘Lula’ – Congolese subgroup A hybrids. Additionally, certain ‘Lula’ – Wild hybrids showed low specific leaf area and stomatal density, indicating potential fitness advantages in dry environments, warranting further study. Our findings highlight the agronomic potential of underexplored Robusta coffee genotypes from the Democratic Republic of the Congo and indicate the need for further screening to maximize their value.
Automatic assessment of speaking proficiency for language practice robots
Eva Verhelst, Pieter Lecompte, Ruben Janssens, Vanessa De Wilde, Tony Belpaeme
In ICSR2025 : International Conference on Social Robotics, Proceedings
2025
Bridging vision and text : applications and challenges of vision-language models in urological surgery
Wouter Bogaert, Nicolas Carl, Karl-Friedrich Kowalewski, Maurice Stephan Michel, Alexandre Mottrie, Pieter De Backer
In EUROPEAN UROLOGY FOCUS
2025
Abstract
Vision-language models (VLMs) integrate visual data, such as surgical videos and medical images, with textual information for advanced artificial intelligence (AI) capabilities in surgery. This mini review highlights recent developments in the application of VLMs to surgical tasks in urology, such as answering clinical questions about surgical images, recognizing surgical instruments, identifying surgical phases, and detecting errors during procedures. Despite the potential of VLMs, significant challenges remain, particularly the limited availability of high-quality data sets. Future progress depends on overcoming these limitations, enhancing the robustness and reliability of VLMs, and creating standardized data sets. Ultimately, VLMs represent a promising advance towards integrated, multimodal AI systems capable of supporting surgeons via automated guidance, educational support, and performance evaluation. Patient summary: Our mini review explores new artificial intelligence (AI) tools that combine visual images and text to assist surgeons during operations. These AI tools can recognize instruments, identify surgical phases, and answer questions about surgery. Improved versions could help in making surgery safer and more efficient in the future.
Cauliflower centre detection and 3-dimensional tracking for robotic intrarow weeding
Axel Willekens, Bert Callens, Francis wyffels, Jan Pieters, Simon R. Cool
In PRECISION AGRICULTURE
2025
Abstract
Mechanical weeding is an important part of integrated weed management. It destroys weeds between (interrow) and in (intrarow) crop rows. Preventing crop damage requires precise detection and tracking of the plants. In this work, a detection and tracking algorithm was developed and integrated on an intrarow hoeing prototype. The algorithm was developed and validated on 12 rows of 950 cauliflower plants. Therefore, a methodology was provided to automatically generate a label based on the crop plants’ Global Navigation Satellite System (GNSS) position during data collection with a robot platform. A CenterNet architecture was adjusted for plant centre detection by comparing different encoder networks and selecting the optimal hyperparameters. The monocular camera projection error of the plant centre detections in pixel to 3D coordinates was evaluated and used in a position- and velocity-based tracking algorithm to determine the timing for intrarow hoeing knife actuation. A dataset of 53k labelled images was created. The best CenterNet model resulted in an F1 score on the test set of 0.986 for detecting cauliflower centres. The position tracking had an average variation of 1.62 cm. Velocity tracking had a standard deviation of 0.008 with respect to the robot’s operational target velocity. Overall, the entire integration showed effective actuation of the prototype in field conditions. Only one false positive detection occurred during operation in two test rows of 135 cauliflowers.
Child speech recognition in human-robot interaction : problem solved?
Ruben Janssens, Eva Verhelst, Giulio Antonio Abbo, Qiaoqiao Ren, Maria Jose Pinto Bernal, Tony Belpaeme
In SOCIAL ROBOTICS, ICSR + AI 2024, PT II
2025
Abstract
Automated Speech Recognition shows superhuman performance for adult English speech on a range of benchmarks, but disappoints when fed children’s speech. This has long sat in the way of child-robot interaction. Recent evolutions in data-driven speech recognition, including the availability of Transformer architectures and unprecedented volumes of training data, might mean a breakthrough for child speech recognition and social robot applications aimed at children. We revisit a study on child speech recognition from 2017 and show that indeed performance has increased, with newcomer OpenAI Whisper doing markedly better than leading commercial cloud services. Performance improves even more in highly structured interactions when priming models with specific phrases. While transcription is not perfect yet, the best model recognises 60.3% of sentences correctly barring small grammatical differences, with sub-second transcription time running on a local GPU, showing potential for usable autonomous child-robot speech interactions.
Control of continuum surgical robots based on deep reinforcement learning
Yi Liu, Thiusius R. Savarimuthu, Andreas Verleysen, Francis wyffels, Di Wu
In CRAS 2025 : proceedings of the 14th Conference on New Technologies for Computer/Robot Assisted Surgery
2025
Conveying emotions to robots through touch and sound
Qiaoqiao Ren, Remko Proesmans, Frederick Bossuyt, Jan Vanfleteren, Francis wyffels, Tony Belpaeme
In SOCIAL ROBOTICS, ICSR + AI 2024, PT III
2025
Abstract
Human emotions can be conveyed through nuanced touch gestures. However, there is a lack of understanding of how consistently emotions can be conveyed to robots through touch. This study explores the consistency of touch-based emotional expression toward a robot by integrating tactile and auditory sensory reading of affective haptic expressions. We developed a piezoresistive pressure sensor and used a microphone to mimic touch and sound channels, respectively. In a study with 28 participants, each conveyed 10 emotions to a robot using spontaneous touch gestures. Our findings reveal a statistically significant consistency in emotion expression among participants. However, some emotions obtained low intraclass correlation values. Additionally, certain emotions with similar levels of arousal or valence did not exhibit significant differences in the way they were conveyed. We subsequently constructed a multi-modal model integrating touch and audio features to decode the 10 emotions. A support vector machine (SVM) model demonstrated the highest accuracy, achieving 40% for 10 classes, with “Attention” being the most accurately conveyed emotion at a balanced accuracy of 87.65%.
Designing social robots with LLMs for engaging human interaction
Maria Jose Pinto Bernal, Matthijs Biondina, Tony Belpaeme
In APPLIED SCIENCES-BASEL
2025
Abstract
Large Language Models (LLMs), particularly those enhanced through Reinforcement Learning from Human Feedback, such as ChatGPT, have opened up new possibilities for natural and open-ended spoken interaction in social robotics. However, these models are not inherently designed for embodied, multimodal contexts. This paper presents a user-centred approach to integrating an LLM into a humanoid robot, designed to engage in fluid, context-aware conversation with socially isolated older adults. We describe our system architecture, which combines real-time speech processing, layered memory summarisation, persona conditioning, and multilingual voice adaptation to support personalised, socially appropriate interactions. Through iterative development and evaluation, including in-home exploratory trials with older adults (n = 7) and a preliminary study with young adults (n = 43), we investigated the technical and experiential challenges of deploying LLMs in real-world human-robot dialogue. Our findings show that memory continuity, adaptive turn-taking, and culturally attuned voice design enhance user perceptions of trust, naturalness, and social presence. We also identify persistent limitations related to response latency, hallucinations, and expectation management. This work contributes design insights and architectural strategies for future LLM-integrated robots that aim to support meaningful, emotionally resonant companionship in socially assistive settings.
Development of an agricultural robot taskmap operation framework
Axel Willekens, Sébastien Temmerman, Francis wyffels, Jan Pieters, Simon Cool
In JOURNAL OF FIELD ROBOTICS
2025
Abstract
Robotic technology in precision crop farming has the potential to minimize inputs, such as labor, fertilizer, or plant protection products, maximizing the net yield while reducing the environmental impact. To maximally exploit the benefits of precision crop farming, it has to be applied continuously over multiple years, which requires (robotic) technology for a wide range of agricultural operations. Researchers need access to (noncommercial) robot platforms with complete mechanical and software controllability to investigate new applications that could unlock the true potential of precision farming. This study presents the agricultural robot taskmap operation framework (ARTOF), which provides common functionality for robots with different vehicle configurations to execute task maps in crop farming applications based on global navigation satellite system positioning. The two-layered software stack has a mechatronic layer and an operational layer. The mechatronic layer performs motion control and includes machine safety to meet the required performance level in correspondence with European regulations. The operational layer performs autonomous implement and navigation control. Add-ons interact with the operational layer using the ARTOF Redis interface and increase flexibility. Hardware-in-the-loop testing enables static end-to-end testing and minimizes the developing time and operational faults when developing new functionality. To demonstrate the framework's flexibility, it was integrated into four in-house developed and modified agricultural robots with four-wheel drive, four-wheel steering (4WD4WS), skid steering, and Ackerman steering vehicle configurations. These robots performed 11 applications under real practice conditions in arable farming and horticulture for—in total—more than 11 km of field application. The power consumption, navigation accuracy, and software usability were evaluated. An average navigation accuracy of 1.0 cm was achieved during hoeing with a 4WD4WS robot using the newly developed navigation controller. This new open-source software framework enables the rapid validation of agricultural robotic research to broaden the number of precision crop farming applications and fully exploit their potential.
Efficacy and effectiveness of robot-assisted therapy for autism spectrum disorder : from lab to reality
Daniel David, Paul Baxter, Tony Belpaeme, Erik Billing, Haibin Cai, Hoang-Long Cao, Anamaria Ciocan, Cristina Costescu, Daniel Hernandez Garcia, Pablo Gomez Esteban, James Kennedy, Honghai Liu, Silviu Matu, Alexandre Mazel, Mihaela Selescu, Emmanuel Senft, Serge Thill, Bram Vanderborght, David Vernon, Tom Ziemke
In SCIENCE ROBOTICS
2025
Abstract
The use of social robots in therapy for children with autism has been explored for more than 20 years, but there still is limited clinical evidence. The work presented here provides a systematic approach to evaluating both efficacy and effectiveness, bridging the gap between theory and practice by targeting joint attention, imitation, and turn-taking as core developmental mechanisms that can make a difference in autism interventions. We present two randomized clinical trials with different robot-assisted therapy implementations aimed at young children. The first is an efficacy trial (n = 69; mean age = 4.4 years) showing that 12 biweekly sessions of in-clinic robot-assisted therapy achieve equivalent outcomes to conventional treatment but with a significant increase in the patients' engagement. The second trial (n = 63; mean age = 5.9 years) evaluates the effectiveness in real-world settings by substituting the clinical setup with a simpler one for use in schools or homes. Over the course of a modest dosage of five sessions, we show equivalent outcomes to standard treatment. Both efficacy and effectiveness trials lend further credibility to the beneficial role that social robots can play in autism therapy while also highlighting the potential advantages of portable and cost-effective setups.
Enabling autonomous and adaptive social robots in education : a vision for the application of generative AI
Eva Verhelst, Ruben Janssens, Tony Belpaeme
In Social robots in education : how to effectively introduce social robots into classrooms
2025
Abstract
The limited autonomy of social robots currently prevents many ambitions in educational robotics from being realised. This leads to scripted dialogues, content that fails to adapt to individual students and conversations remaining largely text-based. Recent advances in generative artificial intelligence (AI) might alleviate these issues, allowing for educational robots whose dialog can be flexibly generated based on the lessons to be taught, the student’s needs and personality, and the environment. This chapter presents a vision of how generative AI can power truly autonomous and adaptive social robots in education, discussing limitations of past educational robotics research, recent technical advances in AI, as well as concrete examples of applications of AI in educational human-robot interaction, and a reflection on limitations of current AI. By bridging technical and pedagogical perspectives, it shows what the next step in the evolution of human-robot interaction in educational contexts might look like.
Enabling high-throughput quantitative wood anatomy through a dedicated pipeline
Jan Bulcke, Louis Verschuren, Ruben De Blaere, Simon Vansuyt, Maxime Dekegeleer, Pierre Kibleur, Olivier Pieters, Tom De Mil, Wannes Hubau, Hans Beeckman, Joris Van Acker, Francis wyffels
In PLANT METHODS
2025
Abstract
Throughout their lifetime, trees store valuable environmental information within their wood. Unlocking this information requires quantitative analysis, in most cases of the surface of wood. The conventional pathway for high-resolution digitization of wood surfaces and segmentation of wood features requires several manual and time consuming steps. We present a semi-automated high-throughput pipeline for sample preparation, gigapixel imaging, and analysis of the anatomy of the end-grain surfaces of discs and increment cores. The pipeline consists of a collaborative robot (Cobot) with sander for surface preparation, a custom-built open-source robot for gigapixel imaging (Gigapixel Woodbot), and a Python routine for deep-learning analysis of gigapixel images. The robotic sander allows to obtain high-quality surfaces with minimal sanding or polishing artefacts. It is designed for precise and consistent sanding and polishing of wood surfaces, revealing detailed wood anatomical structures by applying consecutively finer grits of sandpaper. Multiple samples can be processed autonomously at once. The custom-built open-source Gigapixel Woodbot is a modular imaging system that enables automated scanning of large wood surfaces. The frame of the robot is a CNC (Computer Numerical Control) machine to position a camera above the objects. Images are taken at different focus points, with a small overlap between consecutive images in the X-Y plane, and merged by mosaic stitching, into a gigapixel image. Multiple scans can be initiated through the graphical application, allowing the system to autonomously image several objects and large surfaces. Finally, a Python routine using a trained YOLOv8 deep learning network allows for fully automated analysis of the gigapixel images, here shown as a proof-of-concept for the quantification of vessels and rays on full disc surfaces and increment cores. We present fully digitized beech discs of 30–35 cm diameter at a resolution of 2.25 μm, for which we automatically quantified the number of vessels (up to 13 million) and rays. We showcase the same process for five 30 cm length beech increment cores also digitized at a resolution of 2.25 μm, and generated pith-to-bark profiles of vessel density. This pipeline allows researchers to perform high-detail analysis of anatomical features on large surfaces, test fundamental hypotheses in ecophysiology, ecology, dendroclimatology, and many more with sufficient sample replication.
Evaluating text-to-image diffusion models for texturing synthetic data
Thomas Lips, Francis wyffels
In 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW
2025
Explicit electrons in machine learning potentials
Maarten Cools-Ceuppens, Joni Dambre, Toon Verstraelen
In ML4SPEC25, Abstracts
2025
Exploring the evolutionary adaptations of the unique seahorse tail’s muscle architecture through in silico modelling and robotic prototyping
Dries Marzougui, Riddhi Das, Barbara Mazzolai, Dominique Adriaens, Francis wyffels
In JOURNAL OF THE ROYAL SOCIETY INTERFACE
2025
Abstract
Seahorses possess a unique tail muscle architecture that enables efficient grasping and anchoring onto objects. This prehensile ability is crucial for their survival, as it allows them to resist currents, cling to mates during reproduction and remain camouflaged to avoid predators. Unlike in any other fish, the muscles of the seahorse tail form long, parallel sheets that can span up to 11 vertebral segments. This study investigates how this distinctive muscle arrangement influences the mechanics of prehension. Through in silico simulations validated by a three-dimensional-printed prototype, we reveal the complementary roles of these elongated muscles alongside shorter, intersegmental muscles. Furthermore, we show that muscles spanning more segments allow greater contractile forces and provide more efficient force-to-torque transmissions. Our findings confirm that the elongated muscle–tendon organization in the seahorse tail provides a functional advantage for grasping, offering insights into the evolutionary adaptations of this unique tail structure.
Gentle grasping : a method with low-cost magnetic tactile sensors
Yi Liu, Remko Proesmans, Andreas Verleysen, Francis wyffels
In IEEE ACCESS
2025
Abstract
Human tactile capabilities enable the manipulation of various objects seamlessly in everyday life. We present a grasping strategy employing a two-fingered parallel gripper with a low-cost magnetic tactile sensor. The sensor provides three-dimensional force feedback during the grasping process. Using the tactile sensing, we can detect slip during the object’s lift phase. We propose a force approximation technique that dynamically adjusts force increments to identify the critical slip threshold of objects. This allows the robot to maintain a stable grasp on the threshold where an unknown object is about to slip. We validated our approach through experiments involving 20 diverse everyday objects. The results demonstrate that our slip detection based on low-cost magnetic tactile sensors is effective and that the proposed force approximation method swiftly determines the critical slip threshold for various everyday objects.
Glasses-in-the-Wild Dataset
Louis Adriaens, Thomas Lips, Mathieu De Coster, Andreas Verleysen, Francis wyffels
2025
Abstract
The Glasses-in-the-Wild dataset is a collection of 1,000 RGB images of transparent and partially filled glasses, captured in diverse real-world environments. It was crowdsourced from 11 participants and includes 93 unique glass types across 60 different scenes with varying backgrounds, lighting conditions, reflections, occlusions, and distractors. Each image is annotated with bounding boxes and semantically meaningful keypoints, including the rim, base, and liquid level, to facilitate the training of models for transparent object detection and liquid level estimation. The dataset contains a broad distribution of liquid levels: 24.3% of glasses are empty, while 75.7% contain liquid, with an average fill of 48%. This dataset complements existing transparent object datasets by providing a wider variety of glass shapes, sizes, colors, and real-world conditions, supporting robust training for robotic perception systems and other computer vision applications involving transparent containers.
How conversation type and presumed message source influence users' trust towards mental health conversational agents : the mediator effect of intentional stance
Chen Fang, Fu Guo, Tony Belpaeme
In 2025 34TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN
2025
Abstract
Mental health conversational agents (CAs) are gaining increasing attention as accessible tools for social communication, emotional support, and stress relief. These agents introduce new forms of human-AI interaction, yet the factors influencing user trust remain underexplored. Prior research suggests that conversation type and presumed message source may shape users' experience, but their effects on users' intentional stance and trust in CAs are not well understood. To address this gap, we first conducted a pre-study to develop a questionnaire for measuring users' intentional stance towards mental health CAs. We then carried out a 2 x 2 mixed-design experiment to examine how conversation type and presumed message source influence intentional stance and trust, and whether intentional stance mediates the relationship between conversation type and trust. Results show that conversation type significantly influences user trust, mediated by intentional stance, while presumed message source had no significant effect. These findings advance our understanding of how users form trust in mental health CAs and offer implications for designing more engaging and trustworthy conversational systems in mental health contexts.
Human alignment : how much do we adapt to LLMs?
Tanguy Cazalets, Ruben Janssens, Tony Belpaeme, Joni Dambre
In PROCEEDINGS OF THE 63RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2 : SHORT PAPERS
2025
Abstract
Large Language Models (LLMs) are becoming a common part of our lives, yet few studies have examined how they influence our behavior. Using a cooperative language game in which players aim to converge on a shared word, we investigate how people adapt their communication strategies when paired with either an LLM or another human. Our study demonstrates that LLMs exert a measurable influence on human communication strategies and that humans notice and adapt to these differences irrespective of whether they are aware they are interacting with an LLM. These findings highlight the reciprocal influence of human–AI dialogue and raise important questions about the long-term implications of embedding LLMs in everyday communication.
I was blind but now I see : implementing vision-enabled dialogue in social robots
Giulio Antonio Abbo, Tony Belpaeme
In 2025 20TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI
2025
Abstract
In the rapidly evolving landscape of human-robot interaction, the integration of vision capabilities into conversational agents stands as a crucial advancement. This paper presents a ready-to-use implementation of a dialogue manager that leverages the latest progress in Large Language Models (e.g., GPT-4o mini) to enhance the traditional text-based prompts with real-time visual input. LLMs are used to interpret both textual prompts and visual stimuli, creating a more contextually aware conversational agent. The system's prompt engineering, incorporating dialogue with summarisation of the images, ensures a balance between context preservation and computational efficiency. Six interactions with a Furhat robot powered by this system are reported, illustrating and discussing the results obtained. The system can be customised and is available as a stand-alone application, a Furhat robot implementation, and a ROS2 package.
If they disagree, will you conform?
Giulia Pusceddu, Giulio Antonio Abbo, Francesco Rea, Tony Belpaeme, Alessandra Sciutti
In INTERACTION STUDIES
2025
Abstract
This study investigates whether the opinions of robotic agents are more likely to influence human decision-making when the robots are perceived as value-aware (i.e., when they display an understanding of human principles). We designed an experiment in which participants interacted with two Furhat robots - one programmed to be Value-Aware and the other Non-Value-Aware - during a labeling task for images representing human values. Results indicate that participants distinguished the Value-Aware robot from the Non-Value-Aware one. Although their explicit choices did not indicate a clear preference for one robot over the other, participants directed their gaze more toward the Value-Aware robot. Additionally, the Value-Aware robot was perceived as more loyal, suggesting that value awareness in a social robot may enhance its perceived commitment to the group. Finally, when both robots disagreed with the participant, conformity occurred in about one out of four trials, and participants took longer to confirm their responses, suggesting that two robots expressing dissent may introduce hesitation in decision-making. On one hand, this highlights the potential risk that robots, if misused, could manipulate users for unethical purposes. On the other hand, it reinforces the idea that social robots might encourage reflection in ambiguous situations and help users avoid scams.
Insights for robotic cloth manipulation : a comprehensive analysis of a competition-winning system
Victor-Louis De Gusseme, Remko Proesmans, Thomas Lips, Andreas Verleysen, Francis wyffels
In INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS
2025
Abstract
Robotic cloth manipulation will be important for assistive robots. To thoroughly evaluate progress in this field, a Cloth Manipulation and Perception Competition was organised at IROS 2022 and ICRA 2023. In this article, we present the system that won the folding track at IROS 2022 and the folding and unfolding tracks at ICRA 2023. By combining visual and tactile information with engineered motions, we built a system that can generalise to a range of patterned towels made from various materials, as required for the competition. We describe our system and its limitations, which we relate to future work with the goal of creating systems that can deal with any cloth, robot, or surface.
Instrumentation for better demonstrations : a case study
Remko Proesmans, Thomas Lips, Francis wyffels
In ICRA 2025 Workshop 'Learning Meets Model-Based Methods for Contact-Rich Manipulation', Abstracts
2025
Abstract
Learning from demonstrations is a powerful paradigm for robot manipulation, but its effectiveness hinges on both the quantity and quality of the collected data. In this work, we present a case study of how instrumentation, i.e. integration of sensors, can improve the quality of demonstrations and automate data collection. We instrument a squeeze bottle with a pressure sensor to learn a liquid dispensing task, enabling automated data collection via a PI controller. Transformer-based policies trained on automated demonstrations outperform those trained on human data in 78% of cases. Our findings indicate that instrumentation not only facilitates scalable data collection but also leads to better-performing policies, highlighting its potential in the pursuit of generalist robotic agents.
Integrating plant growth monitoring in a precision intrarow hoeing tool through canopy cover segmentation
Axel Willekens, Francis wyffels, Jan Pieters, Simon Cool
In NEURAL COMPUTING & APPLICATIONS
2025
Abstract
Compared to broadcast applications, precision crop farming (PCF) can decrease environmental impact and increase biodiversity by its potential to reduce chemical inputs. High-throughput field phenotyping (HTFP) uses technology to reveal spatial and temporal variability in crop fields. It is mainly used for crop breeding but also has potential in PCF. We integrated HTFP in a precision intrarow hoeing tool by continuously monitoring the cauliflower growth through canopy cover segmentation. A dataset of 53,483 cauliflower canopy segmentation labels was generated. The plant centres detected by the intrarow hoeing tool to avoid plant contact were used to select the Segment Anything model semantic labels representing the canopy cover. This automated few-point labelling strategy enabled 79% of the images to be immediately generated correctly. Compared to other YOLOv8 models trained for agricultural applications in literature, the YOLOv8 model yielded an excellent mean average precision (mAP0.5) for bounding box selection and segmentation of 97.2% and 96.8%, respectively, on the test set of 27,100 images. The YOLOv8 model yielded an F1 score of 0.94, which was identical to the F1 score of the Mask R-CNN model and performed the segmentation five times faster. Additionally, the large dataset was used to quantify the number of labels required for good YOLOv8 model performance for this application. Based on the YOLO segmentations, the canopy cover area was calculated and used to determine the growth curves of 87 cauliflower plants in a crop field, reflecting the local field conditions and supporting decision-making for precise crop management. This study is, to our knowledge, the first to reuse data collected during the operation of a precision hoeing machine for crop monitoring and demonstrates a cost-effective integration of HTFP into precision farming machinery without additional hardware costs. This approach allows precision farming equipment to generate a continuous stream of data, providing farmers with valuable insights into their fields. Because the data are a by-product of an existing field operation, the farmers have a supportive monitoring tool at no additional cost.
Integrating visual context into language models for situated social conversation starters
Ruben Janssens, Pieter Wolfert, Thomas Demeester, Tony Belpaeme
In IEEE TRANSACTIONS ON AFFECTIVE COMPUTING
2025
Abstract
Embodied conversational agents that interact socially with people in the physical world require multi-modal capabilities, such as appropriately responding to visual features of users. While existing vision-and-language models can generate language based on visual input, this language is not situated in a social interaction in the physical world. We present a novel task called Visual Conversation Starters, where an agent generates a conversation-starting question referring to features visible in an image of the user. We collect a dataset of 4000 images of people with 12000 crowdsourced conversation starters and compare various model architectures: fine-tuning smaller seq2seq or image-to-text models versus zero-shot prompting of GPT-3.5, using image captions versus end-to-end image input, and training on human data versus synthetic questions generated by GPT-3.5. Models were used to generate friendly conversation starters, which were evaluated on criteria including language fluency, visual grounding, interestingness, and politeness. Results show that GPT-3.5 generates more interesting and polite questions than smaller models that are fine-tuned on crowdsourced data, but vision-to-language models are better at referencing visual features and can mimic GPT-3.5's performance. This demonstrates the feasibility of deep visiolinguistic models for situated social agents, forming an important first stage in creating situated multimodal social interaction.
Large language models cover for speech recognition mistakes : evaluating conversational AI for second language learners
Eva Verhelst, Tony Belpaeme
In 2025 20TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI
2025
Abstract
Automatic Speech Recognition (ASR) technology has been reported to reach near-human performance in recent years, yet it continues to struggle with atypical speakers, particularly second language learners. This limitation has hindered progress in leveraging social robots for second language education, a field with significant promise. Recent advancements in Large Language Models (LLMs), which demonstrate capabilities in context understanding, common sense reasoning, and pragmatics, offer a potential solution by compensating for transcription errors introduced by ASR. This study examines whether ASR combined with an LLM can produce flowing conversation. Particularly, we look at its application in learning French as a second language by Dutch-speaking students. Through task-based interactions, where successful task completion depends on the accurate interpretation of user speech, the study evaluates the impact of LLMs on conversational outcomes. Results confirm that the performance of ASR degrades significantly for both speakers with limited proficiency and a non-English language. Nonetheless, LLMs demonstrate the ability to interpret context and sustain meaningful conversations despite suboptimal ASR outputs, highlighting a promising path forward for the integration of these technologies in second-language education.
Leveraging Large Language Models for a Swahili mathematics ITS in Tanzania : designing effective prompts
Edger P. Rutatola, Koenraad Stroeken, Tony Belpaeme
In GENERATIVE SYSTEMS AND INTELLIGENT TUTORING SYSTEMS, ITS 2025, PT I
2025
Abstract
The advancement of Large Language Models (LLMs) has significantly enhanced intelligent tutoring systems, enabling them to engage learners through natural dialogues. This interaction boosts learner engagement but presents challenges for low-resource languages, such as Swahili – Tanzania’s national language. By design, LLMs rely on patterns learned during training to predict subsequent words, making them more suited for conversational tasks than factual computations and reasoning tasks, such as solving mathematics problems. This study investigates the suitability of GPT-4 in generating Swahili-language mathematics content for teaching geometry to primary school students, assessing both contextual and factual accuracy. Using nine varied prompts, we generated 621 different topic introductions, which were evaluated by primary school mathematics teachers. Results reveal that GPT-4 can generate contextually relevant content but struggles with complex mathematical computations. Additionally, the prompt variations provided valuable insights into designing effective prompts for similar tasks.
Machine translation from signed to spoken languages : state of the art and challenges (01 Apr, 10.1007/s10209-023-00992-1, 2023)
Mathieu De Coster, Dimitar Shterionov, Mieke Van Herreweghe, Joni Dambre
2025
Model-free coordinated control of nested continuum robots via reinforcement learning in constrained environments
Yi
Liu,
Thiusius R.
Savarimuthu,
Andreas
Verleysen,
Francis
wyffels,
Di
Wu
In IROS Workshop on Embodied Intelligence for Medical Robotics : Learning, Adaption, and Interaction, Abstracts
2025
BIBLIO
Abstract
Nested continuum robots (NCRs) offer exceptional flexibility and miniaturization for applications in minimally invasive surgery and constrained industrial environments. However, achieving coordinated control among multiple nested segments remains challenging due to the high-dimensional configuration space, nonlinear coupling, and environmental constraints. This paper proposes a model-free reinforcement learning (RL) framework for the coordinated control of NCRs operating in constrained environments. Unlike model-based approaches that rely on approximate kinematic or dynamic models, our approach learns control policies directly from interaction data, enabling adaptive behavior without explicitly modeling the robot’s mechanics. To improve sample efficiency and stability, we incorporate environmental constraints, such as obstacle avoidance and workspace restrictions, into the RL formulation via constrained policy optimization. Extensive simulation studies demonstrate that our approach outperforms baseline controllers in tracking accuracy, robustness to unmodeled dynamics, and success rate for constrained navigation tasks. These results highlight the potential of model-free RL to unlock reliable and scalable control policies for continuum robots in real-world applications.
Multimodal prediction of valence and arousal from speech for emotion-aware interaction systems
Safal
Dhungana,
Maria Jose
Pinto Bernal,
Tony
Belpaeme
In Social Robotics + AI : 17th International Conference, ICSR+AI 2025, Proceedings, Part II
2025
BIBLIO
Abstract
Emotion recognition is essential for social robots to engage empathetically with humans. We propose a multimodal approach that integrates GPT-based textual analysis and Wav2Vec2-based acoustic processing to predict continuous emotional dimensions, valence and arousal, from speech. Our GRU-based neural ensemble achieves Concordance Correlation Coefficients of 0.715 for valence and 0.674 for arousal, significantly outperforming unimodal approaches. This method enables robots to more effectively interpret nuanced emotional states in real-world human-robot interactions.
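The abstract above reports Concordance Correlation Coefficients (CCC) for a GRU-based ensemble over GPT-derived text features and Wav2Vec2 acoustic features. As a rough illustration only, the sketch below shows the CCC metric and a generic late-fusion GRU regressor; the feature dimensions, layer sizes and fusion strategy are assumptions, not the authors' architecture.
```python
# Hypothetical sketch: CCC metric and a late-fusion GRU regressor for
# valence/arousal prediction. Feature dimensions are made up; the
# paper's actual GPT + Wav2Vec2 ensemble is not reproduced here.
import numpy as np
import torch
import torch.nn as nn

def ccc(pred: np.ndarray, target: np.ndarray) -> float:
    """Concordance Correlation Coefficient between two 1-D arrays."""
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(), target.var()
    cov = ((pred - mu_p) * (target - mu_t)).mean()
    return 2 * cov / (var_p + var_t + (mu_p - mu_t) ** 2)

class LateFusionGRU(nn.Module):
    """Two GRU encoders (text tokens and audio frames) fused into a
    2-D output: (valence, arousal)."""
    def __init__(self, text_dim=768, audio_dim=768, hidden=128):
        super().__init__()
        self.text_gru = nn.GRU(text_dim, hidden, batch_first=True)
        self.audio_gru = nn.GRU(audio_dim, hidden, batch_first=True)
        self.head = nn.Linear(2 * hidden, 2)

    def forward(self, text_seq, audio_seq):
        _, h_text = self.text_gru(text_seq)     # (1, B, hidden)
        _, h_audio = self.audio_gru(audio_seq)  # (1, B, hidden)
        fused = torch.cat([h_text[-1], h_audio[-1]], dim=-1)
        return self.head(fused)                 # (B, 2): valence, arousal

# Example usage with random stand-in features
model = LateFusionGRU()
text = torch.randn(4, 20, 768)   # batch of 4 utterances, 20 token embeddings
audio = torch.randn(4, 50, 768)  # batch of 4 utterances, 50 Wav2Vec2 frames
print(model(text, audio).shape)  # torch.Size([4, 2])
```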
One-shot demonstration for slicing and cutting everyday food items
Yi
Liu,
Andreas
Verleysen,
Francis
wyffels
In IEEE ROBOTICS AND AUTOMATION LETTERS
2025
BIBLIO
Abstract
Cutting everyday food items presents a significant challenge in robotics due to the multiple types of knife skills and the unpredictable mechanical behaviour of materials during manipulation. To address this, we propose a one-shot demonstration-based framework that integrates the imitation of both position and force trajectories of knife skills using dynamic movement primitives (DMPs). Our approach combines: (1) a compensation method to replicate human-like force trajectory, and (2) skill-specific constraints enabling online trajectory re-planning during cutting. We designed three knife skill demos for the robot and tested them on 14 unknown food items. The experiments are conducted to evaluate the effectiveness of the proposed force compensation and re-planning methods. The results demonstrate that our framework can successfully imitate various knife skills and cut previously unknown food items with high precision.
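Dynamic movement primitives encode a demonstrated trajectory as a stable second-order system plus a learned forcing term that vanishes as the movement completes. The sketch below is a minimal one-dimensional discrete DMP fitted to a single demonstration and rolled out again; the paper's force-trajectory compensation and skill-specific re-planning constraints are not modelled here, and all gains and basis-function counts are assumptions.
```python
# Minimal 1-D discrete DMP: fit to one demonstrated trajectory, then
# roll out towards the same goal. A generic textbook DMP, not the
# paper's full position + force framework.
import numpy as np

def fit_dmp(y_demo, dt, n_basis=30, alpha=25.0, beta=6.25, alpha_s=4.0):
    T = len(y_demo)
    y0, g = y_demo[0], y_demo[-1]
    yd = np.gradient(y_demo, dt)
    ydd = np.gradient(yd, dt)
    # Canonical system: phase s decays from 1 towards 0 over the demo
    s = np.exp(-alpha_s * np.arange(T) * dt)
    # Forcing term the demonstration implies
    f_target = ydd - alpha * (beta * (g - y_demo) - yd)
    # Gaussian basis functions placed in phase space
    centers = np.exp(-alpha_s * np.linspace(0, T * dt, n_basis))
    widths = 1.0 / (np.diff(centers, append=centers[-1] * 0.5) ** 2 + 1e-8)
    psi = np.exp(-widths * (s[:, None] - centers[None, :]) ** 2)
    # Linear least squares for the basis weights
    features = s[:, None] * psi / (psi.sum(axis=1, keepdims=True) + 1e-8)
    w, *_ = np.linalg.lstsq(features, f_target, rcond=None)
    return dict(w=w, centers=centers, widths=widths,
                alpha=alpha, beta=beta, alpha_s=alpha_s, y0=y0, g=g)

def rollout(dmp, dt, T):
    y, yd, s = dmp["y0"], 0.0, 1.0
    out = []
    for _ in range(T):
        psi = np.exp(-dmp["widths"] * (s - dmp["centers"]) ** 2)
        f = s * psi @ dmp["w"] / (psi.sum() + 1e-8)
        ydd = dmp["alpha"] * (dmp["beta"] * (dmp["g"] - y) - yd) + f
        yd += ydd * dt
        y += yd * dt
        s += -dmp["alpha_s"] * s * dt
        out.append(y)
    return np.array(out)

dt = 0.01
demo = np.sin(np.linspace(0, np.pi / 2, 200))  # toy "knife path"
dmp = fit_dmp(demo, dt)
traj = rollout(dmp, dt, 200)
print(float(abs(traj[-1] - demo[-1])))  # end point should be close to the goal
```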
Online prediction of user enjoyment in human-robot dialogue with LLMs
Ruben
Janssens,
André
Pereira,
Gabriel
Skantze,
Bahar
Irfan,
Tony
Belpaeme
In 2025 20TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI
2025
BIBLIO
Abstract
Large Language Models (LLMs) allow social robots to engage in unconstrained open-domain dialogue, but often make mistakes when employed in real-world interactions, requiring adaptation of LLMs to specific conversational contexts. However, LLM adaptation techniques require a feedback signal, ideally for multiple alternative utterances. At the same time, human-robot dialogue data is scarce and research often relies on external annotators. A tool for automatic prediction of user enjoyment in human-robot dialogue is therefore needed. We investigate the possibility of predicting user enjoyment turn-by-turn using an LLM, giving it a proposed robot utterance within the dialogue context, but without access to user response. We compare this performance to the system's enjoyment ratings when user responses are available and to assessments by expert human annotators, in addition to self-reported user perceptions. We evaluate the proposed LLM predictor in a human-robot interaction (HRI) dataset with conversation transcripts of 25 older adults' 7-minute dialogues with a companion robot. Our results show that an LLM is capable of predicting user enjoyment, without loss of performance despite the lack of user response and even achieving performance similar to that of human expert annotators. Furthermore, results show that the system surpasses expert annotators in its correlation with the user's self-reported perceptions of the conversation. This work presents a tool to remove the reliance on external annotators for enjoyment evaluation and paves the way toward real-time adaptation in human-robot dialogue.
Reshaping reservoirs with Hebbian plasticity : unsupervised adaptation that works
Tanguy
Cazalets,
Joni
Dambre
2025
BIBLIO
Abstract
Reservoir Computing (RC) is a lightweight way to model time-dependent data, yet its reliance on static, randomly initialized network architectures often limits performance on challenging real-world problems. We introduce Hebbian Architecture Generation (HAG), an unsupervised rule that grows connections between neurons that frequently activate together—embodying the biological maxim “neurons that fire together wire together.” Starting from an almost empty reservoir, HAG progressively sculpts a task-specific wiring. Across a diverse set of classification and forecasting tasks reservoirs reshaped by HAG are consistently more accurate than traditional Echo State Networks and than reservoirs tuned with popular plasticity rules such as Intrinsic Plasticity or anti-Oja learning. In other words, letting the network rewire itself from data turns a once-static RC model into a flexible, high-performance learner without a single gradient step. By coupling the efficiency of RC with the adaptability of Hebbian plasticity, HAG moves reservoir computing closer to its biological inspiration and shows that structural self-organisation is a practical route to robust, task-aware processing of real-world time-series data.
Reshaping reservoirs with unsupervised Hebbian adaptation
Tanguy
Cazalets,
Joni
Dambre
In NATURE COMMUNICATIONS
2025
BIBLIO
Abstract
Reservoir Computing (RC) is a lightweight way to model time-dependent data, yet its reliance on static, randomly initialized network architectures often limits performance on challenging real-world problems. We introduce Hebbian Architecture Generation (HAG), an unsupervised rule that grows connections between neurons that frequently activate together-embodying the biological maxim "neurons that fire together wire together." Starting from an almost empty reservoir, HAG progressively sculpts a task-specific wiring. Across a diverse set of classification and forecasting tasks, reservoirs reshaped by HAG are consistently more accurate than traditional Echo State Networks and reservoirs tuned with popular plasticity rules such as Intrinsic Plasticity or Anti-Oja learning. In other words, letting the network rewire itself from data turns a once-static RC model into a flexible, high-performance learner without a single gradient step. By coupling the efficiency of RC with the adaptability of Hebbian plasticity, HAG moves reservoir computing closer to its biological inspiration and shows that structural self-organization is a practical route to robust, task-aware processing of real-world time-series data.
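As we read it, the core of HAG is a Hebbian structural rule: starting from a nearly empty reservoir, connections are repeatedly added between the neuron pairs whose activations are most strongly correlated on the task input. The toy sketch below illustrates that idea on a small echo state network; the actual growth schedule, connection weights, normalisation and any pruning used in HAG are assumptions here.
```python
# Toy illustration of Hebbian connection growth in a reservoir: run the
# (initially almost empty) reservoir on the input, find the most
# co-active neuron pair, and wire it together. The real HAG rule's
# growth schedule, weight values and rescaling differ.
import numpy as np

rng = np.random.default_rng(0)
N, T = 100, 2000
u = rng.standard_normal(T)             # 1-D input signal
W_in = rng.uniform(-0.5, 0.5, size=N)  # input weights
W = np.zeros((N, N))                   # start from an empty reservoir

def run_reservoir(W, leak=0.3):
    x = np.zeros(N)
    states = np.empty((T, N))
    for t in range(T):
        pre = W @ x + W_in * u[t]
        x = (1 - leak) * x + leak * np.tanh(pre)
        states[t] = x
    return states

for _ in range(20):                    # 20 growth steps
    states = run_reservoir(W)
    corr = np.corrcoef(states.T)       # (N, N) activation correlations
    np.fill_diagonal(corr, 0.0)
    corr[W != 0] = 0.0                 # ignore existing connections
    i, j = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
    W[i, j] = 0.1 * np.sign(corr[i, j])  # "fire together, wire together"
    # keep the reservoir stable by bounding the spectral radius
    rho = max(np.abs(np.linalg.eigvals(W)))
    if rho > 0.9:
        W *= 0.9 / rho

print(f"grown {np.count_nonzero(W)} connections")
```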
Self-mixing laser interferometry : in search of an ambient noise-resilient alternative to acoustic sensing
Remko
Proesmans,
Thomas
Lips,
Francis
wyffels
In ICRA 2025 Workshop RoboAcoustics, Abstracts
2025
BIBLIO
Abstract
Self-mixing interferometry (SMI) has been lauded for its sensitivity in detecting microvibrations, while requiring no physical contact with its target. Microvibrations, i.e., sounds, have recently been used as a salient indicator of extrinsic contact in robotic manipulation. In previous work, we presented a robotic fingertip using SMI for extrinsic contact sensing as an ambient-noise-resilient alternative to acoustic sensing. Here, we extend the validation experiments to the frequency domain. We find that for broadband ambient noise, SMI still outperforms acoustic sensing, but the difference is less pronounced than in time-domain analyses. For targeted noise disturbances, analogous to multiple robots simultaneously collecting data for the same task, SMI is still the clear winner. Lastly, we show how motor noise affects SMI sensing more so than acoustic sensing, and that a higher SMI readout frequency is important for future work. Design and data files are available at https://github.com/RemkoPr/icra2025-SMI-tactile-sensing.
Self-mixing laser interferometry for robotic tactile sensing
Remko
Proesmans,
Ward
Goossens,
Lowiek
Stockt,
Lowie
Christiaen,
Francis
wyffels
In 2025 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA
2025
BIBLIO
Abstract
Self-mixing interferometry (SMI) has been lauded for its sensitivity in detecting microvibrations, while requiring no physical contact with its target. In robotics, microvibrations have traditionally been interpreted as a marker for object slip, and recently as a salient indicator of extrinsic contact. We present the first-ever robotic fingertip making use of SMI for slip and extrinsic contact sensing. The design is validated through measurement of controlled vibration sources, both before and after encasing the readout circuit in its fingertip package. Then, the SMI fingertip is compared to acoustic sensing through four experiments. The results are distilled into a technology decision map. SMI was found to be more sensitive to subtle slip events and significantly more resilient against ambient noise. We conclude that the integration of SMI in robotic fingertips offers a new, promising branch of tactile sensing in robotics. Design and data files are available at https://github.com/RemkoPr/icra2025-SMI-tactile-sensing.
Situated haptic interaction : exploring the role of context in affective perception of robotic touch
Qiaoqiao
Ren,
Tony
Belpaeme
In 2025 34TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN
2025
Speech recognition and LLM performance in elderly care home conversations
Maria Jose
Pinto Bernal,
Tony
Belpaeme
In 2025 34TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN
2025
BIBLIO
Abstract
Conversational robots offer promise in elderly care, but dialectal speech poses challenges for automatic speech recognition (ASR). This study evaluates a conversational robot integrating Microsoft Azure ASR and GPT-4o in real-world interactions with elderly users. Results show that ASR accuracy varied significantly (95% for standard French, 45–56% for Dutch dialects such as West Flemish), often leading to transcription errors. Despite this, the LLM restored conversational coherence in 44–52% of misrecognitions, while users contributed 25–35% of repairs. Comparative ASR analysis showed Whisper’s superior dialectal robustness (28% WER) but high latency. Interaction durations ranged from 17 to 45 minutes, with participants perceiving the robot as understanding them despite ASR challenges. This study uniquely integrates ASR performance, LLM recovery, and user adaptation, highlighting the need for hybrid ASR strategies, context-aware dialogue management, and user-driven conversational adaptation for effective human-robot interaction in real-world elderly-care settings.
SPILL : size, pose, and internal liquid level estimation of transparent glassware for robotic bartending
Louis
Adriaens,
Thomas
Lips,
Mathieu
De Coster,
Andreas
Verleysen,
Francis
wyffels
In IEEE ROBOTICS AND AUTOMATION LETTERS
2025
BIBLIO
Abstract
Robotic perception of transparent objects presents unique challenges due to their refractive properties, lack of texture, and limitations of conventional RGB-D sensors in capturing reliable depth information. These challenges significantly hinder robotic manipulation capabilities in real-world settings such as household assistance, hospitality, and healthcare. To address these issues, we propose SPILL: A lightweight perception pipeline for Size, Pose, and Internal Liquid Level estimation of unknown transparent glassware using a single view. SPILL combines object detection with semantic keypoint detection, and operates without requiring object-specific 3D models or depth completion. We demonstrate its effectiveness in autonomous robotic pouring tasks. Additionally, to enhance the robustness and generalization of keypoint detection to diverse real-world scenarios, we introduce Glasses-in-the-Wild, a new dataset that captures a wide variety of glass types in realistic environments. Evaluated on a robot manipulator, SPILL achieves a 93.6% success rate across 500 autonomous pours with 20 unseen glasses in three diverse real-world scenes. We further demonstrate robustness through multiple live public events in real-world, human-centered environments. In one recorded session, the robot autonomously served 62 drinks with a 98.3% success rate. These results demonstrate that task-relevant keypoint detection enables scalable, real-world transparent object interaction, paving the way for practical applications in service and assistive robotics - without spilling a drop.
Teachers’ computational thinking content knowledge : development of a measurement instrument
Sara
Monteyne,
Charlotte
Struyve,
Natacha
Gesquière,
Tom
Neutens,
Francis
wyffels,
Johan
Braak,
Koen
Aesaert
In COMPUTERS & EDUCATION
2025
BIBLIO
Abstract
Computational thinking has become an integral component of curricula worldwide, necessitating teachers to develop this competence in their students. To effectively meet these curricular requirements, teachers themselves need a solid foundation of computational thinking content knowledge, which refers to the understanding and skills they possess in this area. However, despite widespread recognition of this need, few studies have rigorously examined teachers’ content knowledge in this domain. Addressing this gap requires the development of high-quality measurement tools. This study details the development of an instrument, created as part of the International Computer and Information Literacy Study (ICILS) 2023 in Flanders, to measure lower secondary school teachers’ computational thinking content knowledge in a valid and reliable way. The article first outlines the construction process of the instrument, which involved close collaboration with experts in the field and drew upon the framework of Fraillon and colleagues (2023). Following this, the instrument’s psychometric properties are presented, which include both item-level and overall instrument characteristics. These properties were evaluated using data from a sample of 352 participants, applying both Classical Test Theory and Item Response Theory. The final tool consists of 16 multiple-choice and short constructed response questions. The results indicate favorable item and overall instrument characteristics, thereby affirming its potential to measure the intended construct in a valid and accurate way.
Teachers’ self-efficacy and patterns of TPACK regarding computational thinking in Flemish secondary education : a cluster analysis
Seppe
Hermans,
Francis
wyffels,
Peter
Petegem
In TECHNOLOGY PEDAGOGY AND EDUCATION
2025
BIBLIO
Abstract
The authors validate a Dutch version of the Computational Thinking – Technological Pedagogical Content Knowledge (CT-TPACK) survey among Flemish secondary school teachers and identify teacher profiles based on their self-efficacy in teaching computational thinking (CT). Using confirmatory factor analysis and cluster analysis with 234 participants, the study examines how teachers’ self-perceived competencies align across the seven TPACK dimensions. Findings show significant differences by subject background, but not by gender or teaching experience, suggesting that professional development should be tailored to disciplinary contexts. Three distinct teacher profiles emerged, reflecting varying levels of confidence and competence in integrating CT. The study underscores the need for targeted professional development programmes that address gaps in technological and content knowledge, particularly for non-STEM teachers and highlights the importance of ongoing, context-sensitive professional learning to prepare educators for technology-driven education.
Touched by ChatGPT : using an LLM to drive affective tactile interaction
Qiaoqiao
Ren,
Tony
Belpaeme
In 2025 20TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI
2025
BIBLIO
Abstract
Touch is a fundamental aspect of emotion-rich communication, playing a vital role in human interaction and offering significant potential in human-robot interaction. Previous research has demonstrated that a sparse representation of human touch can effectively convey social tactile signals. However, advances in human-robot tactile interaction remain limited, as many humanoid robots possess simplistic capabilities, such as only opening and closing their hands, restricting nuanced tactile expressions. In this study, we explore how a robot can use sparse representations of tactile vibrations to convey emotions to a person. To achieve this, we developed a wearable sleeve integrated with a 5x5 grid of vibration motors, enabling the robot to communicate diverse tactile emotions and gestures. Using chain prompts within a Large Language Model (LLM), we generated distinct 10-second vibration patterns corresponding to 10 emotions (e.g., happiness, sadness, fear) and 6 touch gestures (e.g., pat, rub, tap). Participants (N = 32) then rated each vibration stimulus based on perceived valence and arousal. People are accurate at recognising intended emotions, a result which aligns with earlier findings. These results highlight the LLM's ability to generate emotional haptic data and effectively convey emotions through tactile signals. By translating complex emotional and tactile expressions into vibratory patterns, this research demonstrates how LLMs can enhance physical interaction between humans and robots.
Trained YOLOv8 model and training data accompanying the paper "Enabling high-throughput quantitative wood anatomy through a dedicated pipeline"
Jan
Van den Bulcke,
Louis
Verschuren,
Francis
wyffels
2025
BIBLIO
Abstract
Trained YOLOv8 model for vessel and ray segmentation on (gigapixel) RGB TIFF images of beech. Training and annotation data in COCO format are included. This model is needed for running the source code of which releases can be found on https://doi.org/10.5281/zenodo.14637855. The full images of the increment cores can be found on https://doi.org/10.5281/zenodo.14627909. The full images of the disks (see paper for more details) can be found on https://doi.org/10.6019/S-BIAD1574. This is part of an entire sample preparation, imaging and analysis pipeline available in the paper "Enabling high-throughput quantitative wood anatomy through a dedicated pipeline" by Van den Bulcke and co-authors: https://doi.org/10.1186/s13007-025-01330-7. Cite our paper (when accepted) when using these data and/or model.
Values in social robots : implementing inclusive, value-aware human-robot interactions
Giulio Antonio
Abbo,
Tony
Belpaeme
In 2025 20TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI
2025
BIBLIO
Abstract
Developing value-aware social robots is crucial to improve human-robot interactions, as current designs often lack sensitivity to users' diverse values, impacting inclusivity and user experience. By integrating value-aware mechanisms, robots could adapt to contextual cues like cultural or ethical norms. Our research proposes to implement a value-aware architecture inspired by the global neuronal workspace theory, using the Robot Operating System as the supporting framework, powered by large language models for real-time understanding of user preferences and common ground. Mitigating the models' bias to ensure cultural inclusivity is a key priority. The research carried out so far includes focus groups, a scoping review, and an assessment of the value alignment of several large language models and vision language models. The main challenges are understanding how to model and learn human values, and how to shape the robot's behaviour accordingly. The evaluation will rely on user studies, with a focus on users' experience and inclusivity, aiming to enhance the relevance and sensitivity of social robots for diverse users in everyday interactions.
Vision language models as values detectors
Giulio Antonio
Abbo,
Tony
Belpaeme
In VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2024
2025
BIBLIO
Abstract
Large Language Models integrating textual and visual inputs have introduced new possibilities for interpreting complex data. Despite their remarkable ability to generate coherent and contextually relevant text based on visual stimuli, the alignment of these models with human perception in identifying relevant elements in images requires further exploration. This paper investigates the alignment between state-of-the-art LLMs and human annotators in detecting elements of relevance within home environment scenarios. We created a set of twelve images depicting various domestic scenarios and enlisted fourteen annotators to identify the key element in each image. We then compared these human responses with outputs from five different LLMs, including GPT-4o and four LLaVA variants. Our findings reveal a varied degree of alignment, with LLaVA 34B showing the highest performance but still scoring low. However, an analysis of the results highlights the models’ potential to detect value-laden elements in images, suggesting that with improved training and refined prompts, LLMs could enhance applications in social robotics, assistive technologies, and human-computer interaction by providing deeper insights and more contextually relevant responses.
Why robots are bad at detecting their mistakes : limitations of miscommunication detection in human-robot dialogue
Ruben
Janssens,
Jens
De Bock,
Sofie
Labat,
Eva
Verhelst,
Veronique
Hoste,
Tony
Belpaeme
In 2025 34TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN
2025
BIBLIO
Abstract
Detecting miscommunication in human-robot interaction is a critical function for maintaining user engagement and trust. While humans effortlessly detect communication errors in conversations through both verbal and non-verbal cues, robots face significant challenges in interpreting non-verbal feedback, despite advances in computer vision for recognizing affective expressions. This research evaluates the effectiveness of machine learning models in detecting miscommunications in robot dialogue. Using a multi-modal dataset of 240 human-robot conversations, where four distinct types of conversational failures were systematically introduced, we assess the performance of state-of-the-art computer vision models. After each conversational turn, users provided feedback on whether they perceived an error, enabling an analysis of the models' ability to accurately detect robot mistakes. Despite using state-of-the-art models, performance barely exceeds random chance in identifying miscommunication, although the same models successfully identified confused states on a dataset with more expressive emotional content. To explore the underlying cause, we asked human raters to do the same: they, too, could only identify around half of the induced miscommunications, similar to our models. These results uncover a fundamental limitation in identifying robot miscommunications in dialogue: even when users perceive the induced miscommunication as such, they often do not communicate this to their robotic conversation partner. This knowledge can shape expectations of the performance of computer vision models and can help researchers to design better human-robot conversations by deliberately eliciting feedback where needed.
Word synchronization challenge : a benchmark for word association responses for large language models
Tanguy
Cazalets,
Joni
Dambre
In HUMAN-COMPUTER INTERACTION, HCI 2025, PT V
2025
BIBLIO
Abstract
This paper introduces the Word Synchronization Challenge, a novel benchmark to evaluate large language models (LLMs) in Human-Computer Interaction (HCI). This benchmark utilizes a dynamic game-like framework to test LLMs’ ability to mimic human cognitive processes through word associations. By simulating complex human interactions, it assesses how LLMs interpret and align with human thought patterns during conversational exchanges, essential for effective social partnerships in HCI. Initial findings highlight the influence of model sophistication on performance, offering insights into the models’ capabilities to engage in meaningful social interactions and adapt behaviors in human-like manners. This research advances understanding of LLMs’ potential to replicate or diverge from human cognitive functions, paving the way for more nuanced and empathetic human-machine collaborations.
2024
A photonics perspective on computing with physical substrates
S.
Abreu,
I.
Boikov,
M.
Goldmann,
T.
Jonuzi,
A.
Lupo,
Sarah
Masaad,
L.
Nguyen,
E.
Picco,
G.
Pourcel,
A.
Skalli,
L.
Talandier,
Benedikt
Vettelschoss,
E.A.
Vlieg,
A.
Argyris,
Peter
Bienstman,
D.
Brunner,
Joni
Dambre,
L.
Daudet,
J.D.
Domenech,
I.
Fischer,
F.
Horst,
S.
Massar,
C.R.
Mirasso,
B.J.
Offrein,
A.
Rossi,
M.C.
Soriano,
S.
Sygletos,
S.K.
Turitsyn
In REVIEWS IN PHYSICS
2024
BIBLIO
Abstract
We provide a perspective on the fundamental relationship between physics and computation, exploring the conditions under which a physical system can be harnessed for computation and the practical means to achieve this. Unlike traditional digital computers that impose discreteness on continuous substrates, unconventional computing embraces the inherent properties of physical systems. Exploring simultaneously the intricacies of physical implementations and applied computational paradigms, we discuss the interdisciplinary developments of unconventional computing. Here, we focus on the potential of photonic substrates for unconventional computing, implementing artificial neural networks to solve data-driven machine learning tasks. Several photonic neural network implementations are discussed, highlighting their potential advantages over electronic counterparts in terms of speed and energy efficiency. Finally, we address the challenges of achieving learning and programmability within physical substrates, outlining key strategies for future research.
Adapting multimodal foundation models for text-to-3D retrieval with domain-specific vocabulary
Jensen
Wiedler,
Jarne
Herrewegen,
Tom
Tourwé,
Thomas
Demeester,
Francis
wyffels
In CV4Metaverse 2024, 3rd Computer Vision for Metaverse Workshop, Abstracts
2024
BIBLIO
Abstract
This report addresses the challenge of applying multimodal foundation models such as Uni3D for text-to-shape retrieval in a domain-specific context. While the recent multimodal models show strong performance on zero-shot 3D classification tasks, we found that their out-of-the-box performance on our dental manufacturing dataset is underwhelming, achieving only 13.21% zero-shot test accuracy. With appropriate fine-tuning, however, we managed to improve the text-to-3D classification performance to 86.02% accuracy – a fine demonstration of the transfer learning capabilities of recent multimodal foundation models.
Adaptive second language tutoring using generative AI and a social robot
Eva
Verhelst,
Ruben
Janssens,
Thomas
Demeester,
Tony
Belpaeme
In COMPANION OF THE 2024 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI 2024 COMPANION
2024
BIBLIO
Abstract
The most effective second language learning occurs through extensive interpersonal interaction and tutoring. However, limited funding and a lack of language teachers often prevent students from engaging in individualised practice, a lack which can be addressed using AI and social robots. We present a system that leverages generative AI to provide customized educational content in real-time, adapting to students' skills through an engaging, visually-grounded game played alongside a social robot. To test effectiveness, we conducted a study in which Dutch high school students learned Spanish vocabulary either with or without the robot present. Results showed significant vocabulary gains regardless of robot presence, indicating the game itself, not the social embodiment, drove learning. While further refinements are needed, these findings highlight the potential for generative AI to deliver personalized language tutoring and circumvent the constraints posed by limited resources and staffing in schools. Ongoing work aims to enhance social presence and better align generative content with individuals' abilities and pacing.
aRTF Clothes dataset
Thomas
Lips,
Francis
wyffels,
Victor-Louis
De Gusseme
2024
BIBLIO
Abstract
A dataset of almost 2000 RGB images of clothes in real-world household scenes. All images* are annotated with semantic keypoints, object masks and object bounding boxes using the COCO keypoints format. The table below summarizes the dataset. Note that the scenes and cloth items for the train and test split are completely distinct, to measure generalization across scenes and cloth items.

              # Scenes        # Clothes       # Images
              Train   Test    Train   Test    Train   Test
Tshirts         6       8      15      20      210     400
Towels          6       8      15      20      210     400
Shorts          6       8       8       9      112     180
Boxershorts     6       8      11      11      154     220
Total           6       8      49      60      686    1200

* Not all boxershort images have been labeled yet.

Codebase: the codebase used to capture the images and label the dataset is available here.

Files: there are two .zip files available:
- `aRTFClothes-rgb.zip` contains all full-sized images, with COCO and CVAT annotations.
- `aRTFClothes-resized-paper-splits` contains 3 zips that each contain the train/val/test splits (as COCO datasets) that were used in the paper. All images are resized to 512x256.

The dataset was released in this paper; if you use the dataset in scientific publications, we ask you to cite this work.
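Since the annotations follow the COCO keypoints format, they can be inspected with the standard pycocotools API, as in the minimal sketch below. The annotation filename and category name are placeholders; the exact file layout inside the released zips is not reproduced here.
```python
# Minimal sketch: loading aRTF Clothes keypoint annotations with the
# standard COCO API. The annotation path and category name below are
# placeholders, not the actual filenames shipped in the zips.
from pycocotools.coco import COCO

coco = COCO("annotations_train.json")          # hypothetical path
cat_ids = coco.getCatIds(catNms=["towel"])     # hypothetical category name
img_ids = coco.getImgIds(catIds=cat_ids)

img = coco.loadImgs(img_ids[0])[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img["id"], catIds=cat_ids))
for ann in anns:
    kps = ann["keypoints"]    # flat list [x1, y1, v1, x2, y2, v2, ...]
    print(img["file_name"], len(kps) // 3, "keypoints")
```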
Automatic calibration for an open-source magnetic tactile sensor
Lowiek
Stockt,
Remko
Proesmans,
Francis
wyffels
In International Conference on Robotics and Automation 2024, Proceedings
2024
BIBLIO
Abstract
Tactile sensing can enable robots to perform complex, contact-rich tasks. Magnetic sensors offer accurate three-axis force measurements while using affordable materials. Calibrating such a sensor involves either manual data collection, or automated procedures with precise mounting of the sensor relative to an actuator. We present an open-source magnetic tactile sensor with an automatic, in situ, gripper-agnostic calibration method, after which the sensor is immediately ready for use. Our goal is to lower the barrier to entry for tactile sensing, fostering collaboration in robotics. Design files and readout code can be found at https://github.com/LowiekVDS/Open-source-Magnetic-Tactile-Sensor.
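The calibration described above amounts to learning a mapping from raw magnetometer readings to three-axis forces using automatically collected data. As a simplified stand-in for that idea, the sketch below fits a linear map (plus bias) by least squares on synthetic pairs; the sensor's actual readout, the in situ data collection and any non-linear calibration model from the paper are not reproduced.
```python
# Simplified calibration sketch: fit a linear map (plus bias) from raw
# 3-axis magnetometer readings to 3-axis forces using least squares.
# Real magnetic tactile sensors are often non-linear; this only stands
# in for the idea of automatic, data-driven calibration.
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical automatically collected calibration pairs
B = rng.standard_normal((500, 3))                  # magnetometer readings
A_true = np.array([[2.0, 0.1, 0.0],
                   [0.0, 1.8, 0.2],
                   [0.1, 0.0, 2.5]])
F = B @ A_true.T + np.array([0.05, -0.02, 0.1])    # "measured" forces
F += 0.01 * rng.standard_normal(F.shape)           # sensor noise

# Solve F ~= [B, 1] @ theta for theta (3 rows of A plus a bias row)
X = np.hstack([B, np.ones((len(B), 1))])
theta, *_ = np.linalg.lstsq(X, F, rcond=None)
A_est, bias_est = theta[:3].T, theta[3]

def readings_to_force(b):
    """Convert one raw magnetometer sample to an estimated 3-axis force."""
    return A_est @ b + bias_est

print(readings_to_force(B[0]), "vs", F[0])
```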
AwarePrompt : using diffusion models to create methods for measuring value-aware AI architectures
Kinga
Ciupinska,
Serena
Marchesi,
Giulio Antonio
Abbo,
Tony
Belpaeme,
Agnieszka
Wykowska
In Proceedings of the 16th International Conference on Agents and Artificial Intelligence : volume 3
2024
Child speech recognition in human-robot interaction : problem solved?
Ruben
Janssens,
Eva
Verhelst,
Giulio Antonio
Abbo,
Qiaoqiao
Ren,
Maria Jose
Pinto Bernal,
Tony
Belpaeme
In TAHRI '24 : Proceedings of the 2024 International Symposium on Technological Advances in Human-Robot Interaction
2024
BIBLIO
Abstract
Automated Speech Recognition shows superhuman performance for adult English speech on a range of benchmarks, but disappoints when fed children's speech. This has long stood in the way of child-robot interaction. Recent evolutions in data-driven speech recognition, including the availability of Transformer architectures and unprecedented volumes of training data, might mean a breakthrough for child speech recognition and social robot applications aimed at children. We revisit a study on child speech recognition from 2017 and show that indeed performance has increased, with newcomer OpenAI Whisper doing markedly better than leading commercial cloud services. While transcription is not perfect yet, the best model recognises 60.3% of sentences correctly barring small grammatical differences, with sub-second transcription time running on a local GPU, showing potential for usable autonomous child-robot speech interactions.
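For reference, the sketch below shows the kind of evaluation loop the abstract implies: transcribing recordings locally with the open-source openai-whisper package and counting sentences that match the reference after light normalisation. The file names, model size and normalisation rule are assumptions, not the study's protocol.
```python
# Rough evaluation sketch with the open-source openai-whisper package:
# transcribe each child recording and count exact (case/punctuation-
# insensitive) sentence matches. Paths, model size and normalisation
# are placeholders, not the study's actual protocol.
import re
import whisper

def normalise(text: str) -> str:
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

model = whisper.load_model("small")   # larger models tend to do better

samples = [
    ("child_001.wav", "the cat sat on the mat"),   # hypothetical files
    ("child_002.wav", "i like playing football"),
]

correct = 0
for path, reference in samples:
    result = model.transcribe(path, language="en")
    if normalise(result["text"]) == normalise(reference):
        correct += 1

print(f"sentence accuracy: {correct / len(samples):.1%}")
```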
Compliant robust control for robotic insertion of soft bodies
Yi
Liu,
Andreas
Verleysen,
Francis
wyffels
In IEEE ROBOTICS AND AUTOMATION LETTERS
2024
BIBLIO
Abstract
This letter proposes a novel framework for insertion-type tasks with soft bodies, such as cleaning a bottle with a soft brush. First, a multimodal model based on vision and force perception is trained. Domain randomization is used for the soft body's properties to overcome the simulation-to-reality gap. Second, we propose a dynamic safety lock method based on force perception, which is embedded in the training model to make sure that the tool explores and traverses the hole's path in a compliant way. This results in a higher success rate without damaging the tools or holes. Finally, we perform experiments in simulation and the real world, and the success rate of our proposed method reaches 85.14% in simulation and 83.45% in the real world. Ablation experiments in the real world demonstrate that our method is effective for complex paths and soft bodies with varying deformation intensities.
Computational thinking competencies of Flemish college students : vision on data collection
Willem
Lapage,
Francis
wyffels,
Tom
Neutens
In Colloque Didapro 10 sur la Didactique de l’informatique et des STIC
2024
BIBLIO
Abstract
Computational thinking has become an increasingly vital competence in our technologically driven world. As a problem-solving methodology, it can be considered a competence that transcends disciplines and plays an important part in multiple diverse fields. It has also gained a more prominent role in the Flemish education system. Therefore, assessing computational thinking and collecting the necessary data to do so has become increasingly important during students' education. This paper describes how the computational thinking competencies of college students can be monitored in a controlled environment. By combining a literature study as well as knowledge of the context wherein the data will be collected, a subset of data sources has been selected that show potential for a multimodal assessment of computational thinking. This paper outlines an envisioned data collection method to gauge computational thinking competencies among second-year computer science engineering students at Ghent University. The desired end result is a collection of data that can be managed and processed as an input source to assess computational thinking and affect educational practices. This paper describes a way of collecting data that shows potential for a multimodal assessment of computational thinking. It also opens the door for future research exploring the potential of AI-driven methods for automatic assessment and the development of interactive visualisation of said assessments.
Data-driven communicative behaviour generation : a survey
Nurziya
Oralbayeva,
Amir
Aly,
Anara
Sandygulova,
Tony
Belpaeme
In ACM TRANSACTIONS ON HUMAN-ROBOT INTERACTION
2024
BIBLIO
Abstract
The development of data-driven behaviour generating systems has recently become the focus of considerable attention in the fields of human-agent interaction and human-robot interaction. Although rule-based approaches were dominant for years, these proved inflexible and expensive to develop. The difficulty of developing production rules, as well as the need for manual configuration to generate artificial behaviours, places a limit on how complex and diverse rule-based behaviours can be. In contrast, actual human-human interaction data collected using tracking and recording devices makes humanlike multimodal co-speech behaviour generation possible using machine learning and specifically, in recent years, deep learning. This survey provides an overview of the state of the art of deep learning-based co-speech behaviour generation models and offers an outlook for future research in this area.
Editorial : plant sensing and computing - PlantComp 2022
Michiel
Stock,
Tom
De Swaef,
Francis
wyffels
2024
Een heldere kijk op artificiële intelligentie, machinelearning en Large Language Models
In Geneeskunde in tijden van AI : handvatten voor vandaag en morgen
2024
Empowering vocational students : a research-based framework for computational thinking integration
Seppe
Hermans,
Tom
Neutens,
Francis
wyffels,
Peter
Van Petegem
In EDUCATION SCIENCES
2024
BIBLIO
Abstract
Vocational Education and Training (VET) faces significant challenges in equipping individuals for modern workplaces, which increasingly require digital literacy and Computational Thinking (CT) skills. This paper addresses the imperative of integrating CT into VET programs and outlines key research questions. Our methodology primarily involves a systematic literature review, resulting in the identification of 29 relevant papers. Through qualitative content analysis, we develop a CT integration framework that connects CT practices and integration elements to the engineering design process, while highlighting the VET context. Arguably, the innovative aspect of this framework lies in its core dimensions of harnessing computational power for enhanced efficiency. Raising the question of whether computers can optimize the efficiency and effectiveness of specific tasks is paramount for addressing challenges in technology-rich environments. Therefore, this inquiry merits unwavering attention at every stage of the process. The proposed framework provides educators with a structured approach to identify integration opportunities and help prepare students for multifaceted vocational careers. Furthermore, other key findings underscore the inherently interdisciplinary nature of VET, the growing demand for STEM competencies, and the transformative potential of CT integration. Implications emphasize the need for further research, supportive policies, and practical CT integration. Despite limitations, this study strongly advocates for CT integration, empowering VET students for success in the contemporary workforce.
Exploiting signal propagation delays to match task memory requirements in reservoir computing
Stefan-Teodor
Iacob,
Joni
Dambre
In BIOMIMETICS
2024
BIBLIO
Abstract
Recurrent neural networks (RNNs) transmit information over time through recurrent connections. In contrast, biological neural networks use many other temporal processing mechanisms. One of these mechanisms is the inter-neuron delays caused by varying axon properties. Recently, this feature was implemented in echo state networks (ESNs), a type of RNN, by assigning spatial locations to neurons and introducing distance-dependent inter-neuron delays. These delays were shown to significantly improve ESN task performance. However, thus far, it is still unclear why distance-based delay networks (DDNs) perform better than ESNs. In this paper, we show that by optimizing inter-node delays, the memory capacity of the network matches the memory requirements of the task. As such, networks concentrate their memory capabilities to the points in the past which contain the most information for the task at hand. Moreover, we show that DDNs have a greater total linear memory capacity, with the same amount of non-linear processing power.
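For readers unfamiliar with the metric, the linear memory capacity referred to here is commonly defined per delay as the squared correlation between the input delayed by k steps and a linear readout trained to reconstruct it, summed over all delays (a standard formulation from the echo state network literature):
```latex
% Linear memory capacity of a reservoir (standard definition):
% y_k(t) is the output of a linear readout trained to reconstruct
% the input delayed by k steps, u(t-k).
MC_k = \frac{\operatorname{cov}^2\bigl(u(t-k),\, y_k(t)\bigr)}
            {\operatorname{var}\bigl(u(t)\bigr)\,\operatorname{var}\bigl(y_k(t)\bigr)},
\qquad
MC = \sum_{k=1}^{\infty} MC_k .
```
A task whose target depends mostly on inputs around a particular delay k is served best when the MC_k profile concentrates its mass near that delay, which is what optimising the inter-node delays allows.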
Exploring the effectiveness of evaluation practices for computer-generated nonverbal behaviour
Pieter
Wolfert,
Gustav Eje
Henter,
Tony
Belpaeme
In APPLIED SCIENCES-BASEL
2024
BIBLIO
Abstract
This paper compares three methods for evaluating computer-generated motion behaviour for animated characters: two commonly used direct rating methods and a newly designed questionnaire. The questionnaire is specifically designed to measure the human-likeness, appropriateness, and intelligibility of the generated motion. Furthermore, this study investigates the suitability of these evaluation tools for assessing subtle forms of human behaviour, such as the subdued motion cues shown when listening to someone. This paper reports six user studies, namely studies that directly rate the appropriateness and human-likeness of a computer character’s motion, along with studies that instead rely on a questionnaire to measure the quality of the motion. As test data, we used the motion generated by two generative models and recorded human gestures, which served as a gold standard. Our findings indicate that when evaluating gesturing motion, the direct rating of human-likeness and appropriateness is to be preferred over a questionnaire. However, when assessing the subtle motion of a computer character, even the direct rating method yields less conclusive results. Despite demonstrating high internal consistency, our questionnaire proves to be less sensitive than directly rating the quality of the motion. The results provide insights into the evaluation of human motion behaviour and highlight the complexities involved in capturing subtle nuances in nonverbal communication. These findings have implications for the development and improvement of motion generation models and can guide researchers in selecting appropriate evaluation methodologies for specific aspects of human behaviour.
Fine-tuning 3D foundation models for geometric object retrieval
Jarne
Herrewegen,
Tom
Tourwé,
Maks
Ovsjanikov,
Francis
wyffels
In COMPUTERS & GRAPHICS-UK
2024
BIBLIO
Abstract
Foundation models, such as ULIP-2 (Xue et al., 2023), have recently projected the field of 3D deep learning forward. These models are trained with significantly more data and show superior representation learning capacity in many downstream tasks like 3D shape classification and few-shot part segmentation. A particular characteristic of the recent 3D foundation models is that they are typically multi-modal, and involve image (2D) as well as caption (text) branches. This leads to an intricate interplay that benefits all modalities. At the same time, the 3D encoders involved in these foundation models are, on their own, not well understood. Specifically, there is little analysis of either the utility of the pre-trained 3D features provided by these models or their capacity to adapt to new downstream 3D data. Furthermore, existing studies typically focus on label-oriented downstream tasks, such as shape classification, and ignore other critical applications, such as 3D content-based object retrieval. In this paper, we fill this gap and show, for the first time, how 3D foundation models can be leveraged for strong 3D-to-3D retrieval performance on seven different datasets, on par with state-of-the-art view-based architectures. We evaluate both the pre-trained foundation models, as well as their fine-tuned versions using downstream data. We compare supervised fine-tuning using classification labels against two self-supervised label-free fine-tuning methods. Importantly, we introduce and describe a methodology for fine-tuning, as we found this to be crucial to make transfer learning from 3D foundation models work in a stable manner.
Human-robot interaction : an introduction
Christoph
Bartneck,
Tony
Belpaeme,
Friederike
Eyssel,
Takayuki
Kanda,
Merel
Keijsers,
Selma
Šabanović
2024
BIBLIO
Abstract
The role of robots in society keeps expanding and diversifying, bringing with it a host of issues surrounding the relationship between robots and humans. This introduction to human–robot interaction (HRI) by leading researchers in this developing field is the first to provide a broad overview of the multidisciplinary topics central to modern HRI research. Written for students and researchers from robotics, artificial intelligence, psychology, sociology, and design, it presents the basics of how robots work, how to design them, and how to evaluate their performance. Self-contained chapters discuss a wide range of topics, including speech and language, nonverbal communication, and processing emotions, plus an array of applications and the ethical issues surrounding them. This revised and expanded second edition includes a new chapter on how people perceive robots, coverage of recent developments in robotic hardware, software, and artificial intelligence, and exercises for readers to test their knowledge.
KeyCLD : learning constrained Lagrangian dynamics in keypoint coordinates from images
Rembert
Daems,
Jeroen
Taets,
Francis
wyffels,
Guillaume
Crevecoeur
In NEUROCOMPUTING
2024
BIBLIO
Abstract
We present KeyCLD, a framework to learn Lagrangian dynamics from images. Learned keypoints represent semantic landmarks in images and can directly represent state dynamics. We show that interpreting this state as Cartesian coordinates, coupled with explicit holonomic constraints, allows expressing the dynamics with a constrained Lagrangian. KeyCLD is trained unsupervised end-to-end on sequences of images. Our method explicitly models the mass matrix, potential energy and the input matrix, thus allowing energy-based control. We demonstrate learning of Lagrangian dynamics from images on pendulum, cartpole and acrobot environments. KeyCLD can be learned on these systems, whether they are unactuated, underactuated or fully actuated. Trained models are able to produce long-term video predictions, showing that the dynamics are accurately learned. We compare with Lag-VAE, Lag-caVAE and HGN, and investigate the benefit of the Lagrangian prior and the constraint function. KeyCLD achieves the highest valid prediction time on all benchmarks. Additionally, a very straightforward energy-shaping controller is successfully applied on the fully actuated systems.
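For context, the constrained Lagrangian setting that the abstract refers to can be summarised generically as follows; KeyCLD's specific neural parameterisation of the mass matrix, potential energy, input matrix and constraint function is described in the paper itself:
```latex
% Generic constrained Lagrangian dynamics in Cartesian (keypoint)
% coordinates q: mass matrix M(q), potential energy V(q), input
% matrix g(q), control u, holonomic constraints Phi(q) = 0 enforced
% through Lagrange multipliers lambda.
L(q, \dot{q}) = \tfrac{1}{2}\,\dot{q}^{\top} M(q)\,\dot{q} - V(q),
\qquad \Phi(q) = 0,
\qquad
\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{q}}
  - \frac{\partial L}{\partial q}
  = g(q)\,u + \Bigl(\frac{\partial \Phi}{\partial q}\Bigr)^{\!\top} \lambda .
```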
Learning keypoints for robotic cloth manipulation using synthetic data
Thomas
Lips,
Victor-Louis
De Gusseme,
Francis
wyffels
In IEEE ROBOTICS AND AUTOMATION LETTERS
2024
BIBLIO
Abstract
Assistive robots should be able to wash, fold or iron clothes. However, due to the variety, deformability and self-occlusions of clothes, creating robot systems for cloth manipulation is challenging. Synthetic data is a promising direction to improve generalization, but the sim-to-real gap limits its effectiveness. To advance the use of synthetic data for cloth manipulation tasks such as robotic folding, we present a synthetic data pipeline to train keypoint detectors for almost-flattened cloth items. To evaluate its performance, we have also collected a real-world dataset. We train detectors for T-shirts, towels and shorts and obtain an average precision of 64% and an average keypoint distance of 18 pixels. Fine-tuning on real-world data improves performance to 74% mAP and an average distance of only 9 pixels. Furthermore, we describe failure modes of the keypoint detectors and compare different approaches to obtain cloth meshes and materials. We also quantify the remaining sim-to-real gap and argue that further improvements to the fidelity of cloth assets will be required to further reduce this gap. The code, dataset and trained models are available here.
Machine translation from signed to spoken languages : state of the art and challenges
Mathieu
De Coster,
Dimitar
Shterionov,
Mieke
Van Herreweghe,
Joni
Dambre
In UNIVERSAL ACCESS IN THE INFORMATION SOCIETY
2024
BIBLIO
Abstract
Automatic translation from signed to spoken languages is an interdisciplinary research domain at the intersection of computer vision, machine translation (MT), and linguistics. While the domain is growing in terms of popularity (the majority of scientific papers on sign language (SL) translation have been published in the past five years), research in this domain is performed mostly by computer scientists in isolation. This article presents an extensive and cross-domain overview of the work on SL translation. We first give a high level introduction to SL linguistics and MT to illustrate the requirements of automatic SL translation. Then, we present a systematic literature review of the state of the art in the domain. Finally, we outline important challenges for future research. We find that significant advances have been made on the shoulders of spoken language MT research. However, current approaches often lack linguistic motivation or are not adapted to the different characteristics of SLs. We explore challenges related to the representation of SL data, the collection of datasets and the evaluation of SL translation models. We advocate for interdisciplinary research and for grounding future research in linguistic analysis of SLs. Furthermore, the inclusion of deaf and hearing end users of SL translation applications in use case identification, data collection, and evaluation, is of utmost importance in the creation of useful SL translation models.
Memory-non-linearity trade-off in distance-based delay networks
Stefan-Teodor
Iacob,
Joni
Dambre
In BIOMIMETICS
2024
BIBLIO
Abstract
The performance of echo state networks (ESNs) in temporal pattern learning tasks depends both on their memory capacity (MC) and their non-linear processing. It has been shown that linear memory capacity is maximized when ESN neurons have linear activation, and that a trade-off between non-linearity and linear memory capacity is required for temporal pattern learning tasks. The more recent distance-based delay networks (DDNs) have shown improved memory capacity over ESNs in several benchmark temporal pattern learning tasks. However, it has not thus far been studied whether this increased memory capacity comes at the cost of reduced non-linear processing. In this paper, we advance the hypothesis that DDNs in fact achieve a better trade-off between linear MC and non-linearity than ESNs, by showing that DDNs can have strong non-linearity with large memory spans. We tested this hypothesis using the NARMA-30 task and the bitwise delayed XOR task, two commonly used reservoir benchmark tasks that require a high degree of both non-linearity and memory.
Minimizing torque requirements in robotic manipulation through elastic elements optimization in a physics engine
Maxime
Marchal,
Dries
Marzougui,
Raphaël
Furnémont,
Tom
Verstraten,
Francis
wyffels
In INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS
2024
BIBLIO
Abstract
The increasing number of robots and the rising cost of electricity have spurred research into energy-reducing concepts in robotics. One such concept, elastic actuation, introduces compliant elements such as springs into the robot structure. This article presents a comparative analysis between two types of elastic actuation, namely, monoarticular parallel elastic actuation and biarticular parallel elastic actuation, and demonstrates an end-to-end pipeline for their optimization. Starting from the real-world system identification of an RRR robotic arm, we calibrate a simulation model in a general-purpose physics engine and employ in silico evolutionary optimization to co-optimize spring configurations and trajectories for a pick-and-place task. Finally, we successfully transfer the in silico optimized elastic elements and trajectory to the real-world prototype. Our results substantiate the ability of elastic actuation to reduce the actuators’ torque requirements heavily. In contrast to previous work, we highlight the superior performance of the biarticular variant over the monoarticular configuration. Furthermore, we show that a combination of both proves most effective. This work provides valuable insights into the torque-reducing use of elastic actuation and demonstrates an actuator-invariant in silico optimization methodology capable of bridging the sim2real gap.
Modelling and measuring forest microclimate at high spatiotemporal resolution
Emma
Walle,
Steven
De Hertog,
Félicien
Meunier,
Kim
Calders,
Pieter
De Frenne,
Zhizhi
Yang,
Yanlu
Li,
Michiel
Stock,
Francis
wyffels,
Louise
Terryn,
Pieter
Sanczuk,
Tom
Verhelst,
Hans
Verbeeck
In Microclimate Ecology and Biogeography (ME&B), Abstracts
2024
Multi-modal language learning : explorations on learning Japanese vocabulary
Pieter
Wolfert,
Lisa
De Gersem,
Ruben
Janssens,
Tony
Belpaeme
In COMPANION OF THE 2024 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI 2024 COMPANION
2024
BIBLIO
Abstract
We explore robot-assisted language learning with a social robot, in which the robot teaches Japanese vocabulary. Specifically, we study if the mode of presentation of referents of nouns influences learning outcomes, and hypothesise that multimodal presentation of referents leads to improved learning outcomes. Three conditions are tested: referents are either presented as Japanese audio only, referents are visually presented, or referents are presented as actual objects that learners could pick up and manipulate. The learners were taught 4 words per condition and were distracted between the conditions with general questions related to the robot. There was a significant difference in the number of learned words between the audio-only and visual conditions, as well as between the audio-only and tactile conditions. No significant difference was found between the visual and tactile conditions. However, from our study, it follows that both these conditions are preferred over learning through only audio.
Multi-temporal 3D virtual forest reconstruction using terrestrial laser scanning in a temperate forest
Louise
Terryn,
Kim
Calders,
Pieter
De Frenne,
Bart
Kuyken,
Yanlu
Li,
Pieter
Sanczuk,
Michiel
Stock,
Emma
Walle,
Francis
wyffels,
Hans
Verbeeck
In ForestSAT 2024 Conference, Abstracts
2024
BIBLIO
Abstract
Introduction: European forests are currently undergoing large-scale changes in both structure and species composition, primarily driven by climate change and various disturbances affecting the forest canopy. The canopies of forests across Europe are currently opening up due to tree mortality caused by factors such as drought, pests, storms, and fire. Understanding the implications of these changes for forest functionality is crucial for effective forest management. Therefore, it is essential to accurately quantify the spatial and temporal relationships among forest structure, light availability, and microclimate.

Methods: To address this need, we have established a novel edge-to-core transect within a temperate deciduous forest near Ghent (Aelmoeseneiebos, Gontrode, Belgium). This 30 m wide transect spans from the forest edge to 135 meters deep into the forest interior, encompassing both an oak-beech-dominated zone and an ash-dominated zone characterized by ash dieback. Along the transect, we have deployed a densely spaced network of light and microclimate sensors at 15-meter intervals. Additionally, a fiber optic sensing cable for distributed temperature sensing (both in air and soil) runs along the entire transect. A 35-meter-high measuring tower is part of the setup, allowing for measurements of light and microclimate along a vertical transect from the ground to above the canopy. To quantify the temporal and spatial variations in forest structure, we have collected terrestrial laser scanning (TLS) data on a monthly basis since March 2023. These data are acquired using a RIEGL VZ400i laser scanner at a pulse repetition rate of 600 kHz on a 15 by 15 m grid. Using this multi-temporal TLS data we will reconstruct a 3D virtual forest transect through time. To construct this 3D virtual forest transect, the transect point cloud is first fully segmented into individual tree point clouds in RIEGL’s RiSCAN PRO software using a combination of the software’s tree segmentation plug-in and manual corrections. Next, the tree point clouds undergo leaf-wood separation using the GBSeparation algorithm of Tian et al. (2022). This is followed by reconstructing the woody points to the finest detail using cylinders (so-called QSMs, quantitative structure models) for each individual tree, applying the treeQSM version 2.0 workflow of Calders et al. (2015), which builds upon Raumonen et al. (2013). Leaves are added to the tree QSM structures using the Foliage and Needles Naïve Insertion (FaNNI) algorithm (Åkerblom et al., 2018).

Outlook: This virtual forest transect will serve as input for radiative transfer modeling (RTM), a method that simulates the interaction between light and forest structure. Utilizing the highly detailed forest structure obtained from TLS data, 3D RTMs provide an effective means to accurately model light transmission within forests at a high resolution. Using this approach we aim to (i) validate 3D light measurements conducted along the transect and (ii) implement virtual light sensors. The former enables the assessment of uncertainty in the collected time series data, while the latter offers a comprehensive understanding of the light conditions in the canopy. This information is crucial for evaluating the impact of canopy structure on light penetration and its subsequent effects on the microclimate.

References:
Åkerblom, M., Raumonen, P., Casella, E., Disney, M. I., Danson, F. M., Gaulton, R., ... & Kaasalainen, M. (2018). Non-intersecting leaf insertion algorithm for tree structure models. Interface Focus, 8(2), 20170045.
Calders, K., Newnham, G., Burt, A., Murphy, S., Raumonen, P., Herold, M., ... & Kaasalainen, M. (2015). Nondestructive estimates of above-ground biomass using terrestrial laser scanning. Methods in Ecology and Evolution, 6(2), 198-208.
Raumonen, P., Kaasalainen, M., Åkerblom, M., Kaasalainen, S., Kaartinen, H., Vastaranta, M., ... & Lewis, P. (2013). Fast automatic precision tree models from terrestrial laser scanner data. Remote Sensing, 5(2), 491-520.
Tian, Z., & Li, S. (2022). Graph-based leaf-wood separation method for individual trees using terrestrial lidar point clouds. IEEE Transactions on Geoscience and Remote Sensing, 60, 1-11.
No more mumbles : enhancing robot intelligibility through speech adaptation
Qiaoqiao
Ren,
Yuanbo
Hou,
Dick
Botteldooren,
Tony
Belpaeme
In IEEE ROBOTICS AND AUTOMATION LETTERS
2024
BIBLIO
Abstract
Spoken language interaction is at the heart of interpersonal communication, and people flexibly adapt their speech to different individuals and environments. It is surprising that robots, and by extension other digital devices, are not equipped to adapt their speech and instead rely on fixed speech parameters, which often hinder comprehension by the user. We conducted a speech comprehension study involving 39 participants who were exposed to different environmental and contextual conditions. During the experiment, the robot articulated words using different vocal parameters, and the participants were tasked with both recognising the spoken words and rating their subjective impression of the robot's speech. The experiment's primary outcome shows that spaces with good acoustic quality positively correlate with intelligibility and user experience. However, increasing the distance between the user and the robot degraded the user experience, while distracting background sounds significantly reduced speech recognition accuracy and user satisfaction. We next built an adaptive voice for the robot. For this, the robot needs to know how difficult it is for a user to understand spoken language in a particular setting. We present a prediction model that rates how annoying the ambient acoustic environment is and, consequently, how hard it is to understand someone in this setting. Then, we develop a convolutional neural network model to adapt the robot's speech parameters to different users and spaces, while taking into account the influence of ambient acoustics on intelligibility. Finally, we present an evaluation with 27 users, demonstrating superior intelligibility and user experience with adaptive voice parameters compared to a fixed voice.
Plant science in the age of simulation intelligence
Michiel
Stock,
Olivier
Pieters,
Tom
De Swaef,
Francis
wyffels
In FRONTIERS IN PLANT SCIENCE
2024
BIBLIO
Abstract
Historically, plant and crop sciences have been quantitative fields that intensively use measurements and modeling. Traditionally, researchers choose between two dominant modeling approaches: mechanistic plant growth models or data-driven, statistical methodologies. At the intersection of both paradigms, a novel approach, referred to as simulation intelligence, has emerged as a powerful tool for comprehending and controlling complex systems, including plants and crops. This work explores the transformative potential of the nine simulation intelligence motifs for the plant science community, from understanding molecular plant processes to optimizing greenhouse control. Many of these concepts, such as surrogate models and agent-based modeling, have gained prominence in plant and crop sciences. In contrast, some motifs, such as open-ended optimization or program synthesis, still need to be explored further. The motifs of simulation intelligence can potentially revolutionize breeding and precision farming towards more sustainable food production.
Predictive turn-taking : leveraging language models to anticipate turn transitions in human-robot dialogue
Maria Jose
Pinto Bernal,
Tony
Belpaeme
In 2024 33RD IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, ROMAN 2024
2024
BIBLIO
Abstract
Natural and engaging spoken dialogue systems require seamless turn-taking coordination to avoid awkward interruptions and unnatural pauses. Traditional systems often rely on simplistic silence thresholds, relinquishing the turn after a predetermined period of silence, which invariably leads to a suboptimal interaction experience. This work explores the potential of Large Language Models (LLMs) for improved turn-taking prediction. Building upon research that uses linguistic cues, we investigate how LLMs, with their rich contextual knowledge and semantic encoding of language, can be used for this task. We hypothesize that by analysing dialogue context, syntactic structure, and pragmatic cues within the user’s utterance, LLMs can offer more accurate turn-completion predictions. This research evaluates the capabilities of recent LLMs such as Gemini, OpenAI’s API, Anthropic’s Claude2, and Meta AI’s Llama 2 to predict turn-ending points solely based on textual information, and demonstrates how the conversation between elderly users and companion robots can be enhanced by LLM-powered end-of-turn prediction.
Quantifying and modelling feedbacks between forest structure, light, microclimate and carbon cycling in temperate forests
Emma
Walle,
Steven
De Hertog,
Félicien
Meunier,
Kim
Calders,
Pieter
De Frenne,
Zhizhi
Yang,
Michiel
Stock,
Francis
wyffels,
Louise
Terryn,
Pieter
Sanczuk,
Tom
Verhelst,
Hans
Verbeeck
In EGU General Assembly 2024, Abstracts
2024
BIBLIO
Abstract
Studying the feedback between forest structure and the environment, particularly below canopies, is crucial for sustainable forest management, biodiversity conservation, and climate mitigation. Advanced vegetation models play a key role in unraveling the complex interaction between forest composition and environmental conditions, as they allow us to understand the dynamics of ecosystems by simulating the interactions between plant species and their environment. An essential aspect necessitating refinement in these models is understanding how radiation interacts with intricate structures like forest canopies. In this study, we employ advanced terrestrial laser scanning techniques, distributed fiber-optic sensing, and microclimate sensors to investigate the relationships between light, microclimate, carbon cycling, and forest structure in temperate forests. In a temperate forest in Belgium, we have operated a sensor setup since March 2023. It comprises a Distributed Temperature Sensing (DTS) fiber, Temperature and Moisture Sensor (TMS) microclimate loggers, SurveyTag microclimate loggers, Photosynthetic Active Radiation (PAR) sensors, and pyranometer (direct/diffuse) light sensors along a 135 m long transect from forest edge to core. Monthly 3D terrestrial laser scanning (TLS) of the transect allowed us to quantify forest structure with high spatiotemporal resolution. Preliminary results reveal distinct microclimate gradients along the transect and seasonal changes in forest structure in 3D space, including budding and changes in canopy volume. These findings will be used to calibrate and improve existing radiative transfer models (RTMs) to be further implemented in vegetation models. Integrating observations and model parameters in a common framework will provide breakthrough insights into the feedbacks between light, forest structure, microclimate, and their impact on the carbon cycle in temperate forests.
Robotic grasping and manipulation competition at the 2024 IEEE/RAS International Conference on Robotics and Automation
Yu
Sun,
Berk
Calli,
Kenny
Kimble,
Francis
wyffels,
Victor-Louis
De Gusseme,
Kaiyu
Hang,
Salvatore
D’Avella,
Alessio
Xompero,
Andrea
Cavallaro,
Maximo A.
Roa,
Jose
Avendano,
Anastasia
Mavrommati
2024
BIBLIO
Abstract
The Ninth Robotic Grasping and Manipulation Competition (RGMC) took place in Yokohama, Japan, during the 2024 IEEE/RAS International Conference on Robotics and Automation (ICRA). The series of RGMC events started in 2016 at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) with strong support from the conference’s organization committee, and since then they have been held each year at ICRA or IROS [1]. Across the editions, RGMC engaged the community in solving the open challenges associated with various robotic grasping and manipulation tasks for manufacturing, service robots, and logistics, and in advancing research and technology towards more realistic scenarios that can be encountered in daily activities at home or in warehouses. These tasks include assembling and disassembling boards, hand-in-hand grasping, picking and placing various objects, pouring liquids into a cup, bin picking, rearranging and setting formal tables, folding and unfolding cloths, and receiving objects handed over by a person. The goal of RGMC across these tasks is to assess the autonomous manipulation capabilities of a robotic arm when dealing with unknown or novel objects with varying physical properties and when handling scenarios with various degrees of uncertainty caused by a cluttered scene, random initial configurations, or human behaviors when interacting with the robot. For example, objects can vary in their shapes, appearances, transparency, deformability, and weight.
Sign languages as source language for machine translation : historical overview and challenges
Joni
Dambre,
Mathieu
De Coster
In Sign language machine translation
2024
BIBLIO
Abstract
Sign language machine translation (SLMT) is machine translation (MT) in which at least the source or the target language is a sign language, but the combination of both is also possible. In general, today’s approaches to SLMT heavily build on traditional (i.e., text-to-text) MT systems and focus on solutions to adapt those to having a sign language at the input side or at the output side. Both cases lead to very different challenges and technological solutions, and this chapter focuses only on having a sign language as the source language. It covers the differences between written and sign languages that must be taken into account, the historical evolutions we have seen in the field and the way they have been impacted by the amount and quality of the available data.
Sim-to-real dataset of industrial metal objects
Peter
De Roovere,
Steven
Moonen,
Nick
Michiels,
Francis
wyffels
In MACHINES
2024
BIBLIO
Abstract
We present a diverse dataset of industrial metal objects with unique characteristics such as symmetry, texturelessness, and high reflectiveness. These features introduce challenging conditions that are not captured in existing datasets. Our dataset comprises both real-world and synthetic multi-view RGB images with 6D object pose labels. Real-world data were obtained by recording multi-view images of scenes with varying object shapes, materials, carriers, compositions, and lighting conditions. This resulted in over 30,000 real-world images. We introduce a new public tool that enables the quick annotation of 6D object pose labels in multi-view images. This tool was used to provide 6D object pose labels for all real-world images. Synthetic data were generated by carefully simulating real-world conditions and varying them in a controlled and realistic way. This resulted in over 500,000 synthetic images. The close correspondence between synthetic and real-world data and controlled variations will facilitate sim-to-real research. Our focus on industrial conditions and objects will facilitate research on computer vision tasks, such as 6D object pose estimation, which are relevant for many industrial applications, such as machine tending. The dataset and accompanying resources are available on the project website.
Social value alignment in large language models
Giulio Antonio
Abbo,
Serena
Marchesi,
Agnieszka
Wykowska,
Tony
Belpaeme
In VALUE ENGINEERING IN ARTIFICIAL INTELLIGENCE, VALE 2023
2024
BIBLIO
Abstract
Large Language Models (LLMs) have demonstrated remarkable proficiency in text generation and display an apparent understanding of both physical and social aspects of the world. In this study, we look into the capabilities of LLMs to generate responses that align with human values. We focus on five prominent LLMs - GPT-3, GPT-4, PaLM-2, LLaMA-2 and BLOOM - and compare their generated responses with those provided by human participants. To evaluate the value alignment of LLMs, we presented domestic scenarios to the model and elicited a response with minimal prompting instructions. Human raters judged the responses on appropriateness and value alignment. The results revealed that GPT-3, 4 and PaLM-2 performed on par with human participants, displaying a notable level of value alignment in their generated responses. However, LLaMA-2 and BLOOM fell short in this aspect, indicating a possible divergence from human values. Furthermore, our findings indicate that the raters faced difficulty in distinguishing between responses generated by LLMs and those by humans, with raters exhibiting a preference for machine-generated responses in certain cases. These findings shed light on the capabilities of state-of-the-art LLMs to align with human values, but also allow us to speculate on whether these models could be value-aware. This research contributes to the ongoing exploration of LLMs' understanding of ethical considerations and provides insights into their potential for engaging in value-driven interactions.
Tactile interaction with social robots influences attitudes and behaviour
Qiaoqiao
Ren,
Tony
Belpaeme
In INTERNATIONAL JOURNAL OF SOCIAL ROBOTICS
2024
BIBLIO
Abstract
Tactile interaction plays an essential role in human-to-human interaction. People gain comfort and support from tactile interactions with others and touch is an important predictor for trust. While touch has been explored as a communicative modality in HCI and HRI, we here report on two studies in which touching a social robot is used to regulate people's stress levels and consequently their actions. In the first study, we look at whether different intensities of tactile interaction result in a physiological response related to stress, and whether the interaction impacts risk-taking behaviour and trust. We let 38 participants complete a balloon analogue risk task (BART), a computer-based game that serves as a proxy for risk-taking behaviour. In our study, participants are supported by a robot during the BART task. The robot builds trust and encourages participants to take more risk. The results show that affective tactile interaction with the robot increases participants' risk-taking behaviour, but gentle affective tactile interaction increases comfort and lowers stress whereas high-intensity touch does not. We also find that male participants exhibit more risk-taking behaviour than females while being less stressed. Based on this experiment, a second study is used to ascertain whether these effects are caused by the social nature of tactile interaction or by the physical interaction alone. For this, instead of a social robot, participants now have a tactile interaction with a non-social device. The non-social interaction does not result in any effect, leading us to conclude that tactile interaction with humanoid robots is a social phenomenon rather than a mere physical phenomenon.
The ICRA 2024 cloth competition : benchmarking robotic cloth unfolding
Victor-Louis
De Gusseme,
Remko
Proesmans,
Francis
wyffels
In 40th Anniversary of the IEEE International Conference on Robotics and Automation (ICRA@40), Abstracts
2024
BIBLIO
Abstract
Unfolding cloth in the air is a fundamental task in robotic cloth manipulation, essential for various subsequent tasks like folding, hanging, or even assisted dressing. It has been extensively studied, yet until now a standardized performance comparison across many different labs and methods had never been done. The ICRA 2024 Cloth Competition marks a significant milestone by offering a head-to-head evaluation of 11 diverse teams on a shared real-world robotic platform. This unprecedented event not only sets a new benchmark for the field but also fosters collaboration and innovation, while providing a comprehensive dataset to drive future research towards more robust and generalizable solutions in robotic cloth manipulation.
Towards a Definition of Awareness for Embodied AI
Giulio Antonio
Abbo,
Serena
Marchesi,
Kinga
Ciupinska,
Agnieszka
Wykowska,
Tony
Belpaeme
In Proceedings of the 16th International Conference on Agents and Artificial Intelligence : volume 3
2024
2023
'Am I listening?' : evaluating the quality of generated data-driven listening motion
Pieter
Wolfert,
Gustav Eje
Henter,
Tony
Belpaeme
In ICMI '23 Companion : Companion Publication of the 25th International Conference on Multimodal Interaction
2023
BIBLIO
Abstract
This paper asks whether recent models for generating co-speech gesticulation may also learn to exhibit listening behaviour. We consider two models from recent gesture-generation challenges and train them on a dataset of audio and 3D motion capture from dyadic conversations. One model is driven by information from both sides of the conversation, whereas the other only uses the character's own speech. Several user studies are performed to assess the motion generated when the character is speaking actively, versus when the character is the listener in the conversation. We find that participants are reliably able to discern motion associated with listening, whether from motion capture or generated by the models. Both models are thus able to produce distinctive listening behaviour, even though only one model is truly a listener, in the sense that it has access to information from the other party in the conversation. Additional experiments on both natural and model-generated motion find that motion associated with listening is rated as less human-like than motion associated with active speaking.
'Robots for good': ten defining questions
Selma
Sabanovic,
Vicky
Charisi,
Tony
Belpaeme,
Cindy L.
Bethel,
Maja
Mataric,
Robin
Murphy,
Shelly
Levy-Tzedek
2023
BIBLIO
Abstract
Ten questions to guide reflection and assessment of the "good" in robotics projects are suggested.
A bio-inspired model for audio processing
Tanguy
Cazalets,
Joni
Dambre
In 2023 RIVF International Conference on Computing and Communication Technologies (RIVF)
2023
BIBLIO
Abstract
Homeostatic Activity-Dependent Structural Plasticity (HADSP) is a recently introduced technique for generating networks using structural plasticity. The algorithm uses only homeostatic plasticity, yet principles of Hebbian learning emerge. A previous study suggested that HADSP was able to generate networks that effectively leverage the inter-relationships between correlated time series, but the idea was tested only on simple benchmarks. This paper examines HADSP's performance in speech recognition, its first application to a realistic dataset. Mimicking human hearing, a single-variable recording is transformed into a multi-variable time series through audio processing. The bio-inspired HADSP algorithm then creates a reservoir computing architecture, enhancing data representation and improving the performance of the reservoir. Our principal result is that using a spectral representation of the audio signal greatly improves speech recognition performance for echo state networks (ESNs). HADSP-generated architectures show further improvements in performance, corroborating the algorithm's capacity to generate better reservoir connectivity.
A robust and safe strategy for robotic assembly
Yi
Liu,
Andreas
Verleysen,
Francis
wyffels
In ICRA2023 : Embracing Contacts workshop : IEEE International Conference on Robotics and Automation, Proceedings
2023
AI in de zorg : beslissingsbomen in de zorgsector : leerlingencursus : finaliteit doorstroom
Natacha
Gesquière,
Tom
Neutens,
Francis
wyffels
2023
BIBLIO
Abstract
In the project 'AI in de Zorg' (AI in Healthcare), pupils are introduced to the decision tree, a machine learning technique that is widely used in the healthcare sector. The principles of such a decision tree can already be understood with the subject matter of the second stage of secondary education. The Early Warning Score (EWS) is a guideline applied internationally in hospitals to assess a patient's health. The EWS is based on the vital signs: blood pressure, body temperature, heart rate, respiratory rate, alertness, and oxygen saturation. In the project 'AI in de Zorg', pupils examine how the EWS is used in practice. In addition, they investigate how a computer based on artificial intelligence can automatically estimate a patient's risk level. Decision trees from the healthcare sector are socially relevant and form the ideal context for a STEM project that teaches pupils concepts of artificial intelligence.
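As a purely illustrative aside (not part of the course material described above), the minimal sketch below shows the kind of decision tree the project refers to: a small tree trained on made-up vital-sign data to estimate a patient's risk level. All feature names, values, and labels are invented placeholders.

```python
# Minimal sketch (not the course material): a decision tree that estimates a
# patient risk level from vital signs, loosely inspired by the Early Warning
# Score. All data, features, and labels below are made up for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: heart rate (bpm), respiratory rate (/min), body temperature (°C)
X = [
    [72, 14, 36.8],
    [95, 22, 38.5],
    [120, 30, 39.4],
    [68, 12, 36.5],
    [110, 26, 38.9],
    [80, 16, 37.0],
]
# Hypothetical risk labels: 0 = low, 1 = medium, 2 = high
y = [0, 1, 2, 0, 2, 0]

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned rules so the tree structure can be inspected.
print(export_text(clf, feature_names=["heart_rate", "resp_rate", "temperature"]))
print(clf.predict([[100, 24, 38.7]]))  # estimate the risk level for a new patient
```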
An homeostatic activity-dependent structural plasticity algorithm for richer input combination
Tanguy
Cazalets,
Joni
Dambre
In 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN
2023
BIBLIO
Abstract
This paper introduces a novel rate-based variant of homeostatic activity-dependent structural plasticity (HADSP) for echo state networks. Despite its importance in brain development, structural plasticity has been largely overlooked in artificial neural networks. Our algorithm, although using only homeostatic plasticity, lets principles of Hebbian learning emerge. Our analysis sheds light on the information processing capabilities of HADSP-powered echo state networks and suggests that HADSP effectively leverages the inter-relationships of the network's inputs. The study highlights the potential for rate-based HADSP to contribute to the field of computational neuroscience and plasticity in echo state networks. Furthermore, our findings highlight the crucial role of structural plasticity in influencing network function and organization and contribute significantly to the ongoing research on leveraging plasticity for the advancement of reservoir computing techniques.
Augmenting off-the-shelf grippers with tactile sensing
Remko
Proesmans,
Francis
wyffels
In ICRA 2023 : the International Conference on Robotics and Automation, Proceedings
2023
BIBLIO
Abstract
The development of tactile sensing and its fusion with computer vision is expected to enhance robotic systems in handling complex tasks like deformable object manipulation. However, readily available industrial grippers typically lack tactile feedback, which has led researchers to develop and integrate their own tactile sensors. This has resulted in a wide range of sensor hardware, making it difficult to compare performance between different systems. We highlight the value of accessible open-source sensors and present a set of fingertips specifically designed for fine object manipulation, with readily interpretable data outputs. The fingertips are validated through two difficult tasks: cloth edge tracing and cable tracing. Videos of these demonstrations, as well as design files and readout code can be found at https://github.com/RemkoPr/icra-2023-workshop-tactile-fingertips.
Behavioural models of risk-taking in human-robot tactile interactions
Qiaoqiao
Ren,
Yuanbo
Hou,
Dick
Botteldooren,
Tony
Belpaeme
In SENSORS
2023
BIBLIO
Abstract
Touch can have a strong effect on interactions between people, and as such, it is expected to be important to the interactions people have with robots. In an earlier work, we showed that the intensity of tactile interaction with a robot can change how much people are willing to take risks. This study further develops our understanding of the relationship between human risk-taking behaviour, the physiological responses of the user, and the intensity of the tactile interaction with a social robot. We used data collected with physiological sensors during the playing of a risk-taking game (the Balloon Analogue Risk Task, or BART). The results of a mixed-effects model were used as a baseline to predict risk-taking propensity from physiological measures, and these results were further improved through the use of two machine learning techniques, support vector regression (SVR) and multi-input convolutional multihead attention (MCMA), to achieve low-latency risk-taking behaviour prediction during human-robot tactile interaction. The performance of the models was evaluated using the mean absolute error (MAE), root mean squared error (RMSE), and R-squared score (R²), with MCMA obtaining the best result: an MAE of 3.17, an RMSE of 4.38, and an R² of 0.93, compared with the baseline's MAE of 10.97, RMSE of 14.73, and R² of 0.30. The results of this study offer new insights into the interplay between physiological data and the intensity of risk-taking behaviour in predicting human risk-taking behaviour during human-robot tactile interactions. This work illustrates that physiological activation and the intensity of tactile interaction play a prominent role in risk processing during human-robot tactile interaction and demonstrates that it is feasible to use human physiological and behavioural data to predict risk-taking behaviour in human-robot tactile interaction.
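For readers unfamiliar with the evaluation metrics quoted above, the short sketch below shows how MAE, RMSE, and R² can be computed for a set of predictions; the prediction and target values are placeholders, not data from the study.

```python
# Minimal sketch of the evaluation metrics named in the abstract (MAE, RMSE, R²).
# The prediction and target values are placeholders, not data from the study.
import numpy as np

y_true = np.array([12.0, 30.0, 45.0, 22.0, 18.0])   # observed risk-taking scores
y_pred = np.array([14.5, 27.0, 48.0, 20.0, 19.5])   # model predictions

mae = np.mean(np.abs(y_true - y_pred))               # mean absolute error
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))      # root mean squared error
ss_res = np.sum((y_true - y_pred) ** 2)              # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)       # total sum of squares
r2 = 1.0 - ss_res / ss_tot                           # coefficient of determination

print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  R²={r2:.2f}")
```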
CenDerNet : center and curvature representations for render-and-compare 6D pose estimation
Peter
De Roovere,
Rembert
Daems,
Jonathan
Croenen,
Taoufik
Bourgana,
Joris
Hoog,
Francis
wyffels
In Computer Vision : ECCV 2022 Workshops : proceedings, part VIII
2023
BIBLIO
Abstract
We introduce CenDerNet, a framework for 6D pose estimation from multi-view images based on center and curvature representations. Finding precise poses for reflective, textureless objects is a key challenge for industrial robotics. Our approach consists of three stages: First, a fully convolutional neural network predicts center and curvature heatmaps for each view; Second, center heatmaps are used to detect object instances and find their 3D centers; Third, 6D object poses are estimated using 3D centers and curvature heatmaps. By jointly optimizing poses across views using a render-and-compare approach, our method naturally handles occlusions and object symmetries. We show that CenDerNet outperforms previous methods on two industry-relevant datasets: DIMO and T-LESS.
Delay-sensitive local plasticity in echo state networks
Stefan-Teodor
Iacob,
Spyridon
Chavlis,
Panayiota
Poirazi,
Joni
Dambre
In 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN
2023
BIBLIO
Abstract
Time delays are inherently present in any physical or biological network. However, the role of delays in echo state networks (ESNs) has only been touched upon. In recent years, the use of local plasticity has been explored in the field of reservoir computing, and specifically in ESNs. In this paper, we investigate the role of distance dependent inter-neuron delays in adaptive reservoirs. We introduce a novel ESN design called adaptive distance-based delay network (ADDN), that combines inter-neuron delays with local synaptic plasticity in the reservoir weights using a delay sensitive version of the Bienenstock-Cooper-Munro (BCM) rule. We show that ADDNs perform better on prediction tasks compared to ESNs, regular distance-based delay networks, and ESNs with conventional BCM connections. We optimized the hyperparameters of ADDNs and each of the baseline models using covariance matrix adaptation evolution strategy (CMA-ES). We prove that with ADDNs, we can evolve a single set of hyperparameters that can generate networks which, after unsupervised adaptation, can obtain good performance on different Mackey-Glass sequences with a range of different time constants. By adapting its reservoir weights to the dynamics of the input data, ADDNs can generalize between versions of the same “class” of tasks.
Do different robot appearances change emotion recognition in children with ASD?
Maria Jose
Pinto Bernal,
Sergio D.
Sierra M.,
Marcela
Munera,
Diego
Casas,
Adriana
Villa-Moreno,
Anselmo
Frizera-Neto,
Martin F.
Stoelen,
Tony
Belpaeme,
Carlos A.
Cifuentes
In FRONTIERS IN NEUROROBOTICS
2023
BIBLIO
Abstract
Introduction: Socially Assistive Robotics has emerged as a potential tool for rehabilitating cognitive and developmental disorders in children with autism. Social robots found in the literature are often able to teach critical social skills, such as emotion recognition and physical interaction. Even though there are promising results in clinical studies, there is a lack of guidelines on selecting the appropriate robot and how to design and implement the child-robot interaction. Methods: This work aims to evaluate the impacts of a social robot designed with three different appearances according to the results of a participatory design (PD) process with the community. A validation study in the emotion recognition task was carried out with 21 children with autism spectrum disorder. Results: The results showed that robot-like appearances reached a higher percentage of children's attention and that participants performed better when recognizing simple emotions, such as happiness and sadness. Discussion: This study offers empirical support for continuing research on using SAR to promote social interaction with children with ASD. Further long-term research will help to identify the differences between high and low-functioning children.
Experimental results on nonlinear distortion compensation using photonic reservoir computing with a single set of weights for different wavelengths
Emmanuel
Gooskens,
Stijn
Sackesyn,
Joni
Dambre,
Peter
Bienstman
In SCIENTIFIC REPORTS
2023
BIBLIO
Abstract
Photonics-based computing approaches in combination with wavelength division multiplexing offer a potential solution to modern data and bandwidth needs. This paper experimentally takes an important step towards wavelength division multiplexing in an integrated waveguide-based photonic reservoir computing platform by using a single set of readout weights for at least 3 ITU-T channels to efficiently scale the data bandwidth when processing a nonlinear signal equalization task on a 28 Gbps modulated on-off keying signal. Using multiple-wavelength training, we obtain bit error rates well below the 1.5 × 10⁻² forward error correction limit at high fiber input powers of 18 dBm, which result in high nonlinear distortion. The results of the reservoir chip are compared to a tapped delay line filter and clearly show that the system performs nonlinear equalization. This was achieved using only limited post-processing, which in future work can be implemented in optical hardware as well.
Homeostatic activity-dependent structural plasticity in rate based neural networks for richer input combination
Tanguy
Cazalets,
Joni
Dambre
In Bernstein Conference 2023, Abstracts
2023
Improving the classification accuracy in label-free flow cytometry using event-based vision and simple logistic regression
Muhammed Gouda Ahmed
Gouda,
Alessio
Lugnan,
Joni
Dambre,
Gerd
Branden,
Christoph
Posch,
Peter
Bienstman
In IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS
2023
BIBLIO
Abstract
Event-based cameras are novel bio-inspired vision sensors that do not follow the mechanism of traditional frame-based cameras. In the field of data acquisition, the replacement of CMOS cameras with event-based cameras has proved to enhance the accuracy of machine learning methods in situations where critical lighting conditions and rapid dynamics are paramount. In this paper, we investigate for the first time the use of extreme learning machines on data coming from event-based cameras in the context of flow cytometry. Except for the image sensor, the experimental setup is similar to a setup we used in (Lugnan et al., 2020), where we showed that a simple linear classifier can achieve around a 10% error rate on background-subtracted cell frames. Here, we show that the error rate of this simple imaging flow cytometer can be decreased to less than 2% just by making use of the capabilities of an event camera. Moreover, additional benefits like higher sensitivity and efficient memory usage are gained. Finally, we suggest further possible improvements to the experimental setup used to record events from flowing micro-particles, allowing for more accurate and stable cell sorting.
Integrated photonic reservoir computing with an all-optical readout
Chonghuai
Ma,
Joris
Van Kerrebrouck,
Hong
Deng,
Stijn
Sackesyn,
Emmanuel
Gooskens,
Bing
Bai,
Joni
Dambre,
Peter
Bienstman
In OPTICS EXPRESS
2023
BIBLIO
Abstract
Integrated photonic reservoir computing has been demonstrated to be able to tackle different problems because of its neural network nature. A key advantage of photonic reservoir computing over other neuromorphic paradigms is its straightforward readout system, which facilitates both rapid training and robust, fabrication variation-insensitive photonic integrated hardware implementation for real-time processing. We present our recent development of a fully-optical, coherent photonic reservoir chip integrated with an optical readout system, capitalizing on these benefits. Alongside the integrated system, we also demonstrate a weight update strategy that is suitable for the integrated optical readout hardware. Using this online training scheme, we successfully solved 3-bit header recognition and delayed XOR tasks at 20 Gbps in real-time, all within the optical domain without excess delays.
Integrated STEM professional development in interdisciplinary teacher design teams : teacher self-efficacy profiles using cluster analysis
Seppe
Hermans,
Natacha
Gesquière,
Francis
wyffels,
Peter
Van Petegem
In STEM & Open Schooling for Sustainability Education : proceedings of the 4th Educating the Educators Conference
2023
Is the autonomous social robot within reach?
In PROCEEDINGS OF THE 11TH CONFERENCE ON HUMAN-AGENT INTERACTION, HAI 2023
2023
KeyCLD : learning constrained Lagrangian dynamics in keypoint coordinates from images
Rembert
Daems,
Jeroen
Taets,
Francis
wyffels,
Guillaume
Crevecoeur
In Machine Learning and the Physical Sciences Workshop, NeurIPS 2023, Proceedings
2023
Learned thresholds token merging and pruning for vision transformers
Maxim
Bonnaerens,
Joni
Dambre
In TRANSACTIONS ON MACHINE LEARNING RESEARCH
2023
BIBLIO
Abstract
Vision transformers have demonstrated remarkable success in a wide range of computer vision tasks in recent years. However, their high computational cost remains a significant barrier to their practical deployment. In particular, the complexity of transformer models is quadratic in the number of input tokens. Therefore, techniques that reduce the number of input tokens that need to be processed have been proposed. This paper introduces Learned Thresholds token Merging and Pruning (LTMP), a novel approach that leverages the strengths of both token merging and token pruning. LTMP uses learned threshold masking modules that dynamically determine which tokens to merge and which to prune. We demonstrate our approach with extensive experiments on vision transformers on the ImageNet classification task. Our results demonstrate that LTMP achieves state-of-the-art accuracy across reduction rates while requiring only a single fine-tuning epoch, which is an order of magnitude faster than previous methods.
Learning self-supervised task progression metrics : a case of cloth folding
Andreas
Verleysen,
Matthijs
Biondina,
Francis
wyffels
In APPLIED INTELLIGENCE
2023
BIBLIO
Abstract
An important challenge for smart manufacturing systems is finding relevant metrics that capture task quality and progression for process monitoring to ensure process reliability and safety. Data-driven process metrics construct features and labels from abundant raw process data, which incurs costs and inaccuracies due to the labelling process. In this work, we circumvent expensive process data labelling by distilling the task intent from video demonstrations. We present a method to express the task intent in the form of a scalar value by aligning a self-supervised learned embedding to a small set of high-quality task demonstrations. We evaluate our method on the challenging case of monitoring the progress of people folding clothing. We demonstrate that our approach effectively learns to represent task progression without manually labelling sub-steps or progress in the videos. Using case-based experiments, we find that our method learns task-relevant features and useful invariances, making it robust to noise, distractors and variations in the task and shirts. The experimental results show that the proposed method can monitor processes in domains where state representation is inherently challenging.
Leveraging FSPMs for Unconventional Computing with Plants
Olivier
Pieters,
Tom
De Swaef,
Michiel
Stock,
Francis
wyffels
2023
BIBLIO
Abstract
This software contains the code and presentation from the conference paper "Comparing FSPMs using Unconventional Computing Methods", presented at FSPM2023 in Berlin. The software to run the analysis can be found on [GitHub](https://github.com/opieters/fspm2023). The YAML files (`hydroshoot_environment.yml`, `wheatfspm_environment.yml`) should be used to create the Anaconda environments and reproduce the output CSV files. The files are also included for convenience (`hydroshoot.zip` and `WheatFspm.zip`). The source code is also included here in case the original repositories are no longer available on GitHub. The code for the grass leaf model is not yet available because the research paper has not yet been published. The input files (`*_meteo.csv`) contain the input meteorological data. The output files all end with `_data.csv`. Import these into the `data` directory of the GitHub code and you should be able to reproduce the results.
Limitations of audiovisual speech on robots for second language pronunciation learning
Saya
Amioka,
Ruben
Janssens,
Pieter
Wolfert,
Qiaoqiao
Ren,
Maria Jose
Pinto Bernal,
Tony
Belpaeme
In PROCEEDINGS OF THE 2023 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI 2023
2023
BIBLIO
Abstract
The perception of audiovisual speech plays an important role in infants' first language acquisition and continues to be important for language understanding beyond infancy. Beyond that, the perception of speech and congruent lip motion supports language understanding for adults, and it has been suggested that second language learning benefits from audiovisual speech, as it helps learners distinguish speech sounds in the target language. In this paper, we study whether congruent audiovisual speech on a robot facilitates the learning of Japanese pronunciation. 27 native Dutch-speaking participants were trained in Japanese pronunciation by a social robot. The robot demonstrated 30 Japanese words of varying complexity using either congruent audiovisual speech, incongruent visual speech, or computer-generated audiovisual speech. Participants were asked to imitate the robot's pronunciation, recordings of which were rated by native Japanese speakers. Against expectation, the results showed that congruent audiovisual speech resulted in lower pronunciation performance than low-fidelity or incongruent speech. We show that our learners, being native Dutch speakers, are only very weakly sensitive to audiovisual Japanese speech, which possibly explains why learning performance does not seem to benefit from audiovisual speech.
Low-latency classification of social haptic gestures using transformers
Qiaoqiao
Ren,
Yuanbo
Hou,
Tony
Belpaeme
In COMPANION OF THE ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI 2023
2023
BIBLIO
Abstract
Social touch, and its recognition and classification, is increasingly important in human-robot interaction. We present a Transformer-based model trained and evaluated on an open-source dataset. The dataset, the Human-Animal Affective Robot Touch (HAART) dataset, was collected for the 2015 Recognition of Touch Gesture Challenge (RTGC 2015) and contains different haptic actions directed at a robotic animal. The actions are recorded using a multi-resolution pressure sensor. We feed the output, containing the touch type, to the Nao robot so that the robot can sense the type of touch. The proposed transformer-based gesture classification model achieved 72.8% classification accuracy within 2.67 seconds, outperforming the best submitted algorithm of the RTGC 2015, which reached a test classification accuracy of 70.9% but needed 8 seconds.
New insights on homeostatic activity-dependent structural plasticity in rate based neural networks
Tanguy
Cazalets,
Joni
Dambre
In 2023 International Conference on Neuromorphic, Natural and Physical Computing, Proceedings
2023
BIBLIO
Abstract
We introduce a novel rate-based homeostatic activity-dependent structural plasticity (HADSP) algorithm for echo state networks (ESNs). The algorithm employs solely homeostatic plasticity, yet enables the emergence of principles of Hebbian learning. Our analysis suggests that HADSP is able to generate networks that effectively recombine redundant inputs to improve performance, and highlights the role of structural plasticity in influencing network function and organization.
Personalised socially assistive robot for cardiac rehabilitation : critical reflections on long-term interactions in the real world
Bahar
Irfan,
Nathalia
Cespedes,
Jonathan
Casas,
Emmanuel
Senft,
Luisa F.
Gutierrez,
Monica
Rincon-Roncancio,
Carlos A.
Cifuentes,
Tony
Belpaeme,
Marcela
Munera
In USER MODELING AND USER-ADAPTED INTERACTION
2023
BIBLIO
Abstract
Lack of motivation and low adherence rates are critical concerns of long-term rehabilitation programmes, such as cardiac rehabilitation. Socially assistive robots are known to be effective in improving motivation in therapy. However, over longer durations, generic and repetitive behaviours by the robot often result in a decrease in motivation and engagement, which can be overcome by personalising the interaction, such as recognising users, addressing them with their name, and providing feedback on their progress and adherence. We carried out a real-world clinical study, lasting 2.5 years with 43 patients to evaluate the effects of using a robot and personalisation in cardiac rehabilitation. Due to dropouts and other factors, 26 patients completed the programme. The results derived from these patients suggest that robots facilitate motivation and adherence, enable prompt detection of critical conditions by clinicians, and improve the cardiovascular functioning of the patients. Personalisation is further beneficial when providing high-intensity training, eliciting and maintaining engagement (as measured through gaze and social interactions) and motivation throughout the programme. However, relying on full autonomy for personalisation in a real-world environment resulted in sensor and user recognition failures, which caused negative user perceptions and lowered the perceived utility of the robot. Nonetheless, personalisation was positively perceived, suggesting that potential drawbacks need to be weighed against various benefits of the personalised interaction.
Photonic reservoir computing for nonlinear equalization of 64-QAM signals with a Kramers-Kronig receiver
Sarah
Masaad,
Emmanuel
Gooskens,
Stijn
Sackesyn,
Joni
Dambre,
Peter
Bienstman
In NANOPHOTONICS
2023
BIBLIO
Abstract
Photonic reservoirs are machine learning based systems that boast energy efficiency and speed. Thus, they can be deployed as optical processors in fiber communication systems to aid or replace digital signal equalization. In this paper, we simulate the use of a passive photonic reservoir to target nonlinearity-induced errors originating from self-phase modulation in the fiber and from the nonlinear response of the modulator. A 64-level quadrature-amplitude modulated signal is directly detected using the recently proposed Kramers-Kronig (KK) receiver. We train the readout weights by backpropagating through the receiver pipeline, thereby providing extra nonlinearity. Statistically computed bit error rates for fiber lengths of up to 100 km fall below 1 × 10⁻³, outperforming an optical feed-forward equalizer as a linear benchmark. This can find applications in inter-datacenter communications that benefit from the hardware simplicity of a KK receiver and the low power and low latency processing of a photonic reservoir.
Photonic reservoir computing for wavelength multiplexed nonlinear fiber distortion mitigation
Emmanuel
Gooskens,
Stijn
Sackesyn,
Sarah
Masaad,
Joni
Dambre,
Peter
Bienstman
In 2023 IEEE SILICON PHOTONICS CONFERENCE, SIPHOTONICS
2023
BIBLIO
Abstract
We seek to improve nonlinear fiber distortion mitigation for wavelength multiplexed telecommunications in terms of both processing speed and energy efficiency. We propose a photonic reservoir computing hardware implementation maximizing the chip footprint to processing power ratio by employing a single readout for all wavelengths.
Physical constraints on polarization and charge transport in machine-learning potentials
Maarten
Cools-Ceuppens,
Joni
Dambre,
Toon
Verstraelen
In Designing the future : electro-, photo- and thermo-chemical water splitting, Abstracts
2023
Point cloud classification with ModelNet40 : What is left?
Jarne
Herrewegen,
Tom
Tourwé,
Francis
wyffels
In Data-centric Machine Learning Research Workshop at the 40th International Conference on Machine Learning, Proceedings
2023
Quantifying forest structure, light, microclimate and carbon cycling in a temperate forest
Louise
Terryn,
Kim
Calders,
Pieter
De Frenne,
Bart
Kuyken,
Pieter
Sanczuk,
Hans
Verbeeck,
Tom
Verhelst,
Francis
wyffels
In SilviLaser Conference 2023, Abstracts
2023
BIBLIO
Abstract
European forests are undergoing large-scale changes in structure and species composition due to anthropogenic disturbances, climate change, and other canopy disturbances. Forest canopies across Europe are now opening up due to tree mortality associated with drought, pests, storms, and fire. Insights on how this will affect forest functioning are essential to understanding, predicting and managing our forests. Therefore, we want to better quantify the spatial and temporal relationships between forest structure, light and microclimate. For this purpose, we have set up a state-of-the-art edge-to-core transect in a temperate deciduous forest near Ghent (Aelmoeseneiebos, Gontrode, Belgium). The transect goes from the forest edge to 135 meters into the forest, covering both an oak-beech-dominated zone and an ash-dominated zone characterised by ash dieback. Here, we installed a spatially dense network of light and microclimate sensors every 15 meters, as well as a fiber optic sensing cable for distributed temperature sensing along the transect. The transect includes a 35 m high measuring tower, which also enables measuring light and microclimate along a vertical transect. Additionally, terrestrial laser scanning (TLS) data is collected from the transect monthly during leaf-off and leaf-on conditions. This setup will enable us to quantify the temporal and spatial variation of the microclimate and the forest structure in great detail. The continuous measurements will be complemented with campaign-based observations of (1) spectral components (reflectance, transmission & absorption) of leaves, bark and understorey using an ASD field spectrometer to facilitate radiative transfer modelling (RTM) and (2) physiological measurements (photosynthesis, chlorophyll fluorescence, etc.) on the dominant tree species and understorey plants. In a further step of the project, we will reconstruct a 3D virtual forest (using the TLS data) that will be used as an input for radiative transfer modelling (RTM), which simulates the interaction between light and forest structure.
Querying a sign language dictionary with videos using dense vector search
Mathieu
De Coster,
Joni
Dambre
In 2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW
2023
BIBLIO
Abstract
To search for an unknown sign in a sign language dictionary, users typically indicate parameters of the query, e.g., hand shape and signing location. Recent advances in sign language recognition enable video-based sign language dictionary search. In such a system, users can record an unknown sign and retrieve a list of signs that look similar, preferably including the queried sign as one of the top results. We have realized such a system by interpreting it as a dense vector search task. First, we learn a mapping (embedding) from sign videos to a vector space. The dictionary can then be searched by looking for the vectors in this space that are closest to the vector corresponding to the query. We present a proof of concept on a subset of the Flemish Sign Language dictionary. Further research is required to scale up our method to the large vocabularies of entire dictionaries.
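The dictionary-search formulation described above amounts to nearest-neighbour retrieval in an embedding space. As a hedged illustration of that general idea (not the authors' implementation), the sketch below retrieves the dictionary entries closest to a query vector using cosine similarity; the embeddings are random placeholders standing in for learned video embeddings.

```python
# Minimal sketch of dense vector search: retrieve the dictionary entries whose
# embeddings are closest (by cosine similarity) to a query embedding.
# Embeddings are random stand-ins; in the paper they come from a learned
# video-to-vector model.
import numpy as np

rng = np.random.default_rng(0)
num_signs, dim = 1000, 128
dictionary = rng.normal(size=(num_signs, dim))   # one vector per dictionary sign
query = rng.normal(size=dim)                     # embedding of the recorded sign

# Normalise so that a dot product equals cosine similarity.
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)
query /= np.linalg.norm(query)

scores = dictionary @ query                      # cosine similarity to every entry
top_k = np.argsort(-scores)[:5]                  # indices of the 5 closest signs
print(top_k, scores[top_k])
```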
Revisiting proprioceptive sensing for articulated object manipulation
Thomas
Lips,
Francis
wyffels
In ICRA2023 : Embracing Contacts workshop : IEEE International Conference on Robotics and Automation, Proceedings
2023
Seamless integration of tactile sensors for cobots
Remko
Proesmans,
Francis
wyffels
In RoboTac 2023 The 5th IEEE/RSJ International Workshop, Proceedings
2023
BIBLIO
Abstract
The development of tactile sensing is expected to enhance robotic systems in handling complex objects like deformables or reflective materials. However, readily available industrial grippers generally lack tactile feedback, which has led researchers to develop their own tactile sensors, resulting in a wide range of sensor hardware. Reading data from these sensors poses an integration challenge: either external wires must be routed along the robotic arm, or a wireless processing unit has to be fixed to the robot, increasing its size. We have developed a microcontroller-based sensor readout solution that seamlessly integrates with Robotiq grippers. Our Arduino compatible design takes away a major part of the integration complexity of tactile sensors and can serve as a valuable accelerator of research in the field. Design files and installation instructions can be found at https://github.com/RemkoPr/airo-halberd.
Self-supervised learning for robust object retrieval without human annotations
Jarne
Herrewegen,
Tom
Tourwé,
Francis
wyffels
In COMPUTERS & GRAPHICS-UK
2023
BIBLIO
Abstract
This paper explores the potential of self-supervised learning as an alternative to supervised learning in the context of geometry-based 3D object retrieval. With the ongoing digitalization of many industries, an exponentially increasing number of 3D objects are processed by retrieval systems. In order to support new shapes, modern deep learning-based retrieval systems require retraining. The dominant paradigm for optimizing neural networks in this field is supervised classification training. Supervised learning requires time-consuming and expensive data annotation. Moreover, training neural networks for classification introduces a bias towards the classes in the training data, which is undesirable for retrieval systems encountering unseen object types in the wild. Through extensive experiments, we make a direct comparison between supervised and self-supervised learning on four datasets from three different domains (household, manufacturing and medical). For object classes seen during training, self-supervised and supervised learning are competitive. For unseen classes, self-supervised learning outperforms supervised learning in many cases. We conclude that self-supervised learning provides a powerful tool for circumventing labeling costs and providing more robust retrieval systems.
SignON : sign language translation : progress and challenges
Vincent
Vandeghinste,
Dimitar
Shterionov,
Mirella De
Sisto,
Aoife
Brady,
Mathieu
De Coster,
Lorraine
Leeson,
Josep
Blat,
Frankie
Picron,
Marcello Paolo
Scipioni,
Aditya
Parikh,
Louis
Bosch,
John
O'Flaherty,
Joni
Dambre,
Jorn
Rijckaert,
Bram
Vanroy,
Victor Ubieto
Nogales,
Santiago Egea
Gomez,
Ineke
Schuurman,
Gorka
Labaka,
Adrián
Núnez-Marcos,
Irene
Murtagh,
Euan
McGill,
Horacio
Saggion
In Proceedings of the 24th Annual Conference of the European Association for Machine Translation
2023
BIBLIO
Abstract
SignON (https://signon-project.eu/) is a Horizon 2020 project, running from 2021 until the end of 2023, which addresses the lack of technology and services for the automatic translation between sign languages (SLs) and spoken languages, through an inclusive, human-centric solution, hence contributing to the repertoire of communication media for deaf, hard of hearing (DHH) and hearing individuals. In this paper, we present an update of the status of the project, describing the approaches developed to address the challenges and peculiarities of SL machine translation (SLMT).
Sim2real flower detection towards automated Calendula harvesting
Wout
Vierbergen,
Axel
Willekens,
Donald
Dekeyser,
Simon
Cool,
Francis
wyffels
In BIOSYSTEMS ENGINEERING
2023
Surgical phase duration in robot-assisted partial nephrectomy : a surgical data science exploration for clinical relevance
Pieter
De Backer,
Maria
Peraire Lores,
Meret
Demuynck,
Federico
Piramide,
Jente
Simoens,
Tim
Oosterlinck,
Wouter
Bogaert,
Chi Victor
Shan,
Karel
Van Regemorter,
Aube
Wastyn,
Enrico
Checcucci,
Charlotte
Debbaut,
Charles
Van Praet,
Rui
Farinha,
Ruben
De Groote,
Anthony
Gallagher,
Karel
Decaestecker,
Alexandre
Mottrie
In DIAGNOSTICS
2023
BIBLIO
Abstract
(1) Background: Surgical phases form the basic building blocks for surgical skill assessment, feedback, and teaching. The phase duration itself and its correlation with clinical parameters at diagnosis have not yet been investigated. Novel commercial platforms provide phase indications but have not been assessed for accuracy yet. (2) Methods: We assessed 100 robot-assisted partial nephrectomy videos for phase durations based on previously defined proficiency metrics. We developed an annotation framework and subsequently compared our annotations to an existing commercial solution (Touch Surgery, Medtronic (TM)). We subsequently explored clinical correlations between phase durations and parameters derived from diagnosis and treatment. (3) Results: An objective and uniform phase assessment requires precise definitions derived from an iterative revision process. A comparison to a commercial solution shows large differences in definitions across phases. BMI and the duration of renal tumor identification are positively correlated, as are tumor complexity and both tumor excision and renorrhaphy duration. (4) Conclusions: The surgical phase duration can be correlated with certain clinical outcomes. Further research should investigate whether the retrieved correlations are also clinically meaningful. This requires an increase in dataset sizes and facilitation through intelligent computer vision algorithms. Commercial platforms can facilitate this dataset expansion and help unlock the full potential, provided that the phase annotation details are disclosed.
Trends and challenges for sign language recognition with machine learning
J.
Fink,
Mathieu
De Coster,
Joni
Dambre,
B.
Frénay
In ESANN 2023, 31st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Proceedings
2023
BIBLIO
Abstract
Research in natural language processing has led to the creation of powerful tools for individuals, companies... However, these successes for written languages have not yet affected signed languages (SLs) to the same extent. The creation of similar tools for signed languages would benefit deaf, hard of hearing, and hearing people by making SL content, learning, and communication more accessible for everyone. SL recognition and translation are related to AI, but require collaboration with linguists and stakeholders. This paper describes related challenges from an AI researcher's point of view and summarizes the state of the art in these domains.
UnfoldIR : tactile robotic unfolding of cloth
Remko
Proesmans,
Andreas
Verleysen,
Francis
wyffels
In IEEE ROBOTICS AND AUTOMATION LETTERS
2023
BIBLIO
Abstract
Robotic unfolding of cloth is challenging due to the wide range of textile materials and their ability to deform in unpredictable ways. Previous work has focused almost exclusively on visual feedback to solve this task. We present UnfoldIR ("unfolder"), a dual-arm robotic system relying on infrared (IR) tactile sensing and cloth manipulation heuristics to achieve in-air unfolding of randomly crumpled rectangular textiles by means of edge tracing. The system achieves > 85% coverage on multiple textiles of different sizes and textures. After unfolding, at least three corners are visible in 83.3% to 94.7% of cases. Given these strong "tactile-only" results, we argue that the fusion of both tactile and visual sensing can bring cloth unfolding to a new level of performance.
Users’ perspectives on value awareness in social robots
Giulio Antonio
Abbo,
Tony
Belpaeme
In Proceedings of the 1st Workshop on Perspectives on Moral Agency in Human-Robot Interaction
2023
2022
'Cool glasses, where did you get them?' : generating visually grounded conversation starters for human-robot dialogue
Ruben
Janssens,
Pieter
Wolfert,
Thomas
Demeester,
Tony
Belpaeme
In PROCEEDINGS OF THE 2022 17TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '22)
2022
BIBLIO
Abstract
Visually situated language interaction is an important challenge in multi-modal Human-Robot Interaction (HRI). In this context we present a data-driven method to generate situated conversation starters based on visual context. We take visual data about the interactants and generate appropriate greetings for conversational agents in the context of HRI. For this, we constructed a novel open-source data set consisting of 4000 HRI-oriented images of people facing the camera, each augmented by three conversation-starting questions. We compared a baseline retrieval-based model and a generative model. Human evaluation of the models using crowdsourcing shows that the generative model scores best, specifically at correctly referencing visual features. We also investigated how automated metrics can be used as a proxy for human evaluation and found that common automated metrics are a poor substitute for human judgement. Finally, we provide a proof-of-concept demonstrator through an interaction with a Furhat social robot.
A comparative analysis on genome pleiotropy for evolved soft robots
Dries
Marzougui,
Matthijs
Biondina,
Francis
wyffels
In Faculty of Engineering and Architecture Research Symposium 2022 (FEARS 2022), Abstracts
2022
BIBLIO
Abstract
Biological evolution shapes the body and brain of living creatures together over time. By contrast, in evolutionary robotics, brain-body co-optimization remains challenging. Conflicting mutations cause dissociation between morphology and control, which leads to premature convergence. Recent works have proposed algorithmic modifications to mitigate the impact of conflicting mutations. However, the importance of genetic design remains underexposed. Current approaches are divided between a single, pleiotropic genetic encoding and two isolated encodings representing morphology and control. This design choice is commonly made ad hoc, causing a lack of consistency for practitioners. To standardize this design, we performed a comparative analysis between these two configurations and two previously unexplored alternatives on a soft robot locomotion task.
A comparative analysis on genome pleiotropy for evolved soft robots
Dries
Marzougui,
Matthijs
Biondina,
Francis
wyffels
In PROCEEDINGS OF THE 2022 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2022
2022
BIBLIO
Abstract
Biological evolution shapes the body and brain of living creatures together over time. By contrast, in evolutionary robotics, the co-optimization of these subsystems remains challenging. Conflicting mutations cause dissociation between morphology and control, which leads to premature convergence. Recent works have proposed algorithmic modifications to mitigate the impact of conflicting mutations. However, the importance of genetic design remains underexposed. Current approaches are divided between a single, pleiotropic genetic encoding and two isolated encodings representing morphology and control. This design choice is commonly made ad hoc, causing a lack of consistency for practitioners. To standardize this design, we performed a comparative analysis between these two configurations on a soft robot locomotion task. Additionally, we incorporated two currently unexplored alternatives that drive these configurations to their logical extremes. Our results demonstrate that pleiotropic representations yield superior performance in fitness and robustness towards premature convergence. Moreover, we showcase the importance of shared structure in the pleiotropic representation of robot morphology and control to achieve this performance gain. These findings provide valuable insights into genetic encoding design, which supply practitioners with a theoretical foundation to pursue efficient brain-body co-optimization.
A physically sound basis for polarization in machine-learning force fields
Maarten
Cools-Ceuppens,
Joni
Dambre,
Toon
Verstraelen
In Theory, and Computation of Energy Materials, Institute seminar at the Forschungszentrum Jülich's IEK-13, Abstracts
2022
BIBLIO
Abstract
Molecular dynamics simulations can make quantitative predictions of complex physical and chemical processes at the nanoscale, if a suitable model for the potential energy surface (PES) is available for the system of interest. In fact, there are multiple requirements for a PES model before it can be considered suitable. It must be capable of describing all processes of interest, e.g. many force field models preclude chemical reactions, whereas most electronic structure methods can describe them. Also, the accuracy of the PES model must be sufficient to minimize the error on prediction outcomes. Finally, computational efficiency is also an essential requirement, because long molecular dynamics runs are often needed to converge the phase-space sampling. Machine-learning force fields have undeniably reached unprecedented compromises between these requirements: they can be trained to mimic density functional theory (or even better) training data to arbitrary accuracy, while they are computationally much more efficient than electronic structure methods. Despite their popularity and promise, machine-learning force fields have an important limitation: they are inherently short-ranged. The restriction to short ranges has two origins: short real-space cutoff spheres and unfavorable scaling of the required training data for long-range interactions. The first origin is addressed, at least to some extent, in message-passing networks: through multiple message-passing iterations, information from beyond the cutoff distance is taken into account. The second origin is more fundamental and harder to tackle: the number of configurations of atoms at long distances is vast, making it impossible to generate relevant examples for all possibilities. Whenever long-range interactions matter, a physical model is unavoidable. Many physically motivated models for long-range electrostatics and polarization were developed for normal force fields, and were later combined with machine-learning potentials for short-range interactions. In this lecture, advantages and disadvantages of several approaches will be discussed, and a new framework, called electron machine-learning potential (eMLP), will be introduced.
A review of evaluation practices of gesture generation in embodied conversational agents
Pieter
Wolfert,
Nicole
Robinson,
Tony
Belpaeme
In IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS
2022
BIBLIO
Abstract
Embodied conversational agents (ECAs) are often designed to produce nonverbal behavior to complement or enhance their verbal communication. One such form of nonverbal behavior is co-speech gesturing, which involves movements that the agent makes with its arms and hands that are paired with verbal communication. Co-speech gestures for ECAs can be created using different generation methods, divided into rule-based and data-driven processes, with the latter gaining traction because of the increasing interest from the applied machine learning community. However, reports on gesture generation methods use a variety of evaluation measures, which hinders comparison. To address this, we present a systematic review on co-speech gesture generation methods for iconic, metaphoric, deictic, and beat gestures, including reported evaluation methods. We review 22 studies that have an ECA with a human-like upper body that uses co-speech gesturing in social human-agent interaction. This includes studies that use human participants to evaluate performance. We found most studies use a within-subject design and rely on a form of subjective evaluation, but without a systematic approach. We argue that the field requires more rigorous and uniform tools for co-speech gesture evaluation, and formulate recommendations for empirical evaluation, including standardized phrases and example scenarios to help systematically test generative models across studies. Furthermore, we also propose a checklist that can be used to report relevant information for the evaluation of generative models, as well as to evaluate co-speech gesture use.
A social robot activity for novice programmers
Zimcke
Staey,
Natacha
Gesquière,
Francis
wyffels
In RIE 2021, 12th International Conference on Robotics in Education, Proceedings
2022
BIBLIO
Abstract
We have developed an activity on social robots for learners of ages 11-14, as part of a broader AI and CT curriculum for secondary education. We thereby address some of the difficulties that teachers experience when teaching physical computing and challenges of CS education such as gender equality. During the project, learners will use a graphical programming language and simulator to design and implement their robot. Afterward, they can transfer their design to a physical robot.
Anchor pruning for object detection
Maxim
Bonnaerens,
Matthias
Freiberger,
Joni
Dambre
In COMPUTER VISION AND IMAGE UNDERSTANDING
2022
BIBLIO
Abstract
This paper proposes anchor pruning for object detection in one-stage anchor-based detectors. While pruning techniques are widely used to reduce the computational cost of convolutional neural networks, they tend to focus on optimizing the backbone networks, where most of the computation typically occurs. In this work we demonstrate an additional pruning technique, specifically for object detection: anchor pruning. With more efficient backbone networks and a growing trend of deploying object detectors on embedded systems where post-processing steps such as non-maximum suppression can be a bottleneck, the impact of the anchors used in the detection head is becoming increasingly important. In this work, we show that many anchors in the object detection head can be removed without any loss in accuracy. With additional retraining, anchor pruning can even lead to improved accuracy. Extensive experiments on SSD and MS COCO show that the detection head can be made up to 44% more efficient while simultaneously increasing accuracy. Further experiments on RetinaNet and PASCAL VOC show the general effectiveness of our approach. We also introduce 'overanchorized' models that can be used together with anchor pruning to eliminate hyperparameters related to the initial shape of anchors. Code and models are available at https://github.com/Mxbonn/anchor_pruning.
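A hedged sketch of the general idea (not the authors' procedure): starting from the full anchor set of a detection head, greedily drop the anchor whose removal hurts a validation metric least, until the metric degrades beyond a tolerance. The evaluate function below is a placeholder for a real detector evaluation such as COCO mAP, and the anchor shapes are illustrative.

anchors = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (1.0, 3.0), (3.0, 1.0)]  # (scale, aspect ratio) pairs

def evaluate(anchor_set):
    # Placeholder: pretend the metric grows with anchor diversity but saturates quickly.
    return 0.30 + 0.02 * min(len(anchor_set), 3)

full_metric, tolerance = evaluate(anchors), 0.005
while len(anchors) > 1:
    # Score every candidate set obtained by removing one anchor.
    scores = {a: evaluate([b for b in anchors if b != a]) for a in anchors}
    best_anchor, best_metric = max(scores.items(), key=lambda kv: kv[1])
    if full_metric - best_metric > tolerance:
        break  # any further removal costs too much accuracy
    anchors.remove(best_anchor)
print("kept anchors:", anchors)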
Assessment of code, which aspects do teachers consider and how are they valued?
Tom
Neutens,
Kris
Coolsaet,
Francis
wyffels
In ACM TRANSACTIONS ON COMPUTING EDUCATION
2022
BIBLIO
Abstract
In many countries, computer programming is becoming an integral part of the secondary school curriculum. However, many teachers, especially in the first years of Flemish secondary school, have limited experience with teaching programming. To improve their knowledge about programming, many different types of professional development programs have been proposed. Nevertheless, these programs mostly focus on technical skills and less on pedagogical skills. One aspect that is often overlooked in these programs is how teachers can assess code. To get insight into what teachers currently value when assessing code, we designed an experiment that analyzes the different aspects teachers consider during the assessment of code. During the experiment, the teachers (N=13) assessed a set of programs from five different fictional learners. After the assessment, they participated in a structured interview giving us insight into the assessment process. We evaluated the transcripts of the interviews using deductive thematic analysis using a coding schema defining the different aspects of code that can be assessed. Additionally, we linked the assessment strategies of teachers to their teaching experience. Our results indicate that many teachers are unaware of the different concepts that can be part of the assessment of code, which might lead to inaccurate or invalid feedback. Moreover, although our experimental group was too small to draw hard conclusions about the inter-case results, our results indicate that the number of concepts considered by teachers seems to increase with experience. These results provide an initial insight into the code assessment practices of teachers and reveal interesting pathways for future research into the assessment of code.
Attention analysis of a sign language recognition task on the AUTSL dataset
Jeanne
Coppin,
Mathieu
De Coster,
Joni
Dambre
In 32nd Meeting of Computational Linguistics in The Netherlands, Abstracts
2022
BeCoS corpus : Belgian Covid-19 Sign language corpus : a corpus for training sign language recognition and translation
Vincent
Vandeghinste,
Bob
Van Dyck,
Mathieu
De Coster,
Maud
Goddefroy,
Joni
Dambre
In COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL
2022
BIBLIO
Abstract
We are presenting the Belgian Federal COVID-19 corpus, nicknamed the BeCoS (Belgian Covid Sign language) corpus. It consists of the entire archive of official press conferences from the Belgian Federal Government concerning the COVID-19 pandemic. The speakers speak mostly in Dutch or French and occasionally in German, and nearly all speech is accompanied by a deaf signer who performs live interpreting from what is being said. We have preprocessed the corpus with speaker diarisation, applied Belgian Dutch ASR, and post-ASR language identification and punctuation prediction as well as signer diarisation, sign language identification and sign language keypoint recognition. The corpus is made publicly available.
Computational thinking in Flanders’ compulsory education
Natacha
Gesquière,
Francis
wyffels
In Proceedings of Sixth APSCE International Conference on Computational Thinking and STEM Education 2022 (CTE-STEM)
2022
BIBLIO
Abstract
To modernise education, the Flemish government defined new learning goals that take account of 21st-century competences, in particular on ‘digital competence and media literacy’, of which ‘computational thinking and acting’ is one of the building blocks. Since September 2019, ‘computational thinking and acting’ has been compulsory in secondary schools in Flanders. The basic concepts decomposition, abstraction, pattern recognition and generalisation, and algorithm have been pushed forward. A closer look at the newly defined learning goals clarified that ‘acting’ is about basic knowledge in computer science and computational thinking practices. The learning objectives show that ‘computational thinking and acting’ is best addressed in an interdisciplinary way, within a socially relevant context. Based on the abundant scientific literature on the subject, we found these goals to fit into an international perspective. To support teachers, we are adjusting the teaching materials we already developed on physical computing, programming, and AI.
Distance-based delays in echo state networks
Stefan-Teodor
Iacob,
Matthias
Freiberger,
Joni
Dambre
In Intelligent Data Engineering and Automated Learning (IDEAL 2022)
2022
BIBLIO
Abstract
Physical reservoir computing, a paradigm bearing the promise of energy-efficient high-performance computing, has attracted much attention in recent years. We argue, though, that the effect of signal propagation delay on reservoir task performance, one of the most central aspects of physical reservoirs, is still insufficiently understood in a more general learning context. Such physically imposed delay has been found to play a crucial role in some specific physical realizations, such as integrated photonic reservoirs. While delays at the readout layer and input of Echo State Networks (ESNs) have been successfully exploited before to improve performance, to our knowledge this feature has not been studied in a more general setting. We introduce inter-node delays, based on physical distances, into ESNs as model systems for physical reservoir computing. We propose a novel ESN design that includes variable signal delays along the connections between neurons, comparable to varying axon lengths in biological neural networks or varying length delay lines in physical systems. We study the impact of the resulting variable inter-node delays in this setup in comparison with conventional ESNs and find that incorporating variable delays significantly improves reservoir performance on the NARMA-10 benchmark task.
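To make the idea concrete, the following minimal Python sketch (purely illustrative, not the authors' implementation) builds a small echo state network in which every recurrent connection carries an integer delay, standing in for distance-based propagation delays; the network size, connection density and scaling are assumptions.

import numpy as np

rng = np.random.default_rng(0)
n_neurons, max_delay = 100, 5

W_in = rng.uniform(-0.5, 0.5, size=(n_neurons, 1))
W = rng.uniform(-1.0, 1.0, size=(n_neurons, n_neurons)) * (rng.random((n_neurons, n_neurons)) < 0.1)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))                        # set spectral radius to 0.9
delays = rng.integers(1, max_delay + 1, size=(n_neurons, n_neurons))   # per-connection delays

# Split W into one matrix per delay value so the update stays a sum of matrix products.
W_per_delay = [np.where(delays == d, W, 0.0) for d in range(1, max_delay + 1)]

def run_reservoir(u):
    # Collect reservoir states for a 1-D input sequence u; history[d - 1] holds the state d steps ago.
    history = np.zeros((max_delay, n_neurons))
    states = []
    for t in range(len(u)):
        recurrent = sum(W_d @ history[d - 1] for d, W_d in enumerate(W_per_delay, start=1))
        x = np.tanh(W_in @ u[t:t + 1] + recurrent)
        history = np.roll(history, 1, axis=0)
        history[0] = x
        states.append(x)
    return np.array(states)

states = run_reservoir(rng.uniform(0.0, 0.5, size=200))
print(states.shape)  # (200, 100)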
Effective cloth folding trajectories in simulation with only two parameters
Victor-Louis
De Gusseme,
Francis
wyffels
In FRONTIERS IN NEUROROBOTICS
2022
BIBLIO
Abstract
Robotic cloth folding remains challenging for robots due to its highly deformable nature. In order to deal with these deformations, several strategies with varying amounts of adaptability have been proposed. We study robotic cloth folding by simulating and evaluating a trajectory search space with only two parameters: one parameter for the trajectory's height and one parameter to tilt it. We extensively analyzed folding a long-sleeved shirt in a high-fidelity simulator. To demonstrate that the trajectory is sufficiently adaptable and robust, we test several cloth shapes, cloth materials, an entire folding sequence and different folding velocities. We can deal with every folding scenario by tuning the two parameters correctly. The trajectories' simplicity and their robustness in simulation make them ideal candidates for future transfer to real-world robotic setups.
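As an illustration of such a low-dimensional search space, the sketch below is an assumption-laden Python example (not the paper's code) that parameterizes a fold trajectory by only a peak height and a tilt; here the tilt is interpreted as skewing where the peak occurs along the path.

import numpy as np

def fold_trajectory(grasp, place, height=0.3, tilt=0.0, n=50):
    # Return n waypoints from grasp to place; tilt in [-1, 1] skews the apex toward either end.
    s = np.linspace(0.0, 1.0, n)
    xy = np.outer(1.0 - s, grasp) + np.outer(s, place)       # straight line in the table plane
    apex = np.clip(0.5 + 0.5 * tilt, 0.05, 0.95)             # where the peak sits along the path
    z = np.where(s <= apex,
                 height * np.sin(0.5 * np.pi * s / apex),
                 height * np.sin(0.5 * np.pi * (1.0 - s) / (1.0 - apex)))
    return np.column_stack([xy, z])

waypoints = fold_trajectory(np.array([0.0, 0.0]), np.array([0.0, 0.6]), height=0.25, tilt=0.4)
print(waypoints.shape)  # (50, 3)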
Event-based vision for improved classification accuracy in label-free flow cytometry
Muhammed Gouda Ahmed
Gouda,
Alessio
Lugnan,
Joni
Dambre,
Gerd
Branden,
Christoph
Posch,
Peter
Bienstman
In IEEE Benelux Photonics Chapter - Annual Symposium 2022
2022
Evolutionary co-optimisation of robot morphology and control : toward a seahorse-tail inspired robotic manipulator
Dries
Marzougui,
Dominique
Adriaens,
Francis
wyffels
In SEB Annual Conference 2022, Abstracts
2022
BIBLIO
Abstract
The design of robotic manipulators is confronted with a seemingly unavoidable trade-off between flexibility and strength. Unsurprisingly, nature has already come up with a solution and encapsulated it in a particular group of organisms: the seahorses. A seahorse’s body is completely enclosed in highly articulated body armour, made of similar and modular bony plates. Yet, the combination of regional variation in this skeletal anatomy and the soft tissue interconnecting the skeletal units provides a rigid, yet controllable and flexible tail. A seahorse tail thereby intriguingly integrates these two seemingly mutually exclusive properties. Although the seahorse can serve as a bio-inspired foundation for designing a novel type of robotic manipulator, its actual design is far from intuitive and hard to do manually. This raises the need for automated design methodologies. One domain, named Evolutionary Robotics, applies evolution as an optimisation technique to provide a holistic perspective to automated robot design. Evolutionary Robotics has already shown its potential in conventional rigid robotics and provided the necessary backbone for the optimisation of more recent alternatives such as soft robotics. Our work lies at the intersection of biology and robotics, as we aim to create a generic evolutionary brain-body co-optimisation framework. Using this framework, we pursue the automated design of a seahorse-tail-inspired robotic manipulator. From there on, the framework remains applicable to other biomimetic experiments, in addition to providing a testbed for in-silico evolutionary experiments in biology.
Hardware-aware mobile building block evaluation for computer vision
Maxim
Bonnaerens,
Matthias
Freiberger,
Marian
Verhelst,
Joni
Dambre
In APPLIED SCIENCES-BASEL
2022
BIBLIO
Abstract
In this paper, we propose a methodology to accurately evaluate and compare the performance of efficient neural network building blocks for computer vision in a hardware-aware manner. Our comparison uses Pareto fronts based on randomly sampled networks from a design space to capture the underlying accuracy/complexity trade-offs. We show that our approach matches the information obtained by previous comparison paradigms while providing more insight into the relationship between hardware cost and accuracy. We use our methodology to analyze different building blocks and evaluate their performance on a range of embedded hardware platforms. This highlights the importance of benchmarking building blocks as a preselection step in the design process of a neural network. We show that choosing the right building block can speed up inference by up to a factor of two on specific hardware ML accelerators.
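The comparison principle can be sketched in a few lines: sample candidate networks, record a (hardware cost, accuracy) pair for each, and keep the Pareto front. In the illustrative Python sketch below the measurements are random placeholders, not real benchmark data.

import numpy as np

rng = np.random.default_rng(42)
latency = rng.uniform(1.0, 20.0, size=200)                        # e.g. measured ms on a target accelerator
accuracy = 0.9 - 0.4 / latency + 0.02 * rng.standard_normal(200)  # placeholder accuracy per sampled network

def pareto_front(cost, score):
    # Indices of points not dominated by any other point (lower cost and higher score).
    order = np.argsort(cost)
    front, best = [], -np.inf
    for i in order:
        if score[i] > best:
            front.append(i)
            best = score[i]
    return np.array(front)

front = pareto_front(latency, accuracy)
print(f"{len(front)} Pareto-optimal candidates out of {len(latency)} sampled networks")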
Have I got the power? Analysing and reporting statistical power in HRI
Madeleine E.
Bartlett,
C. E. R.
Edmunds,
Tony
Belpaeme,
Serge
Thill
In ACM TRANSACTIONS ON HUMAN-ROBOT INTERACTION
2022
BIBLIO
Abstract
This article presents a discussion of the importance of power analyses, providing an overview of when power analyses should be run in the context of the field of Human-Robot Interaction, as well as some examples of how to perform a power analysis. This work was motivated by the observation that the majority of papers published in the proceedings of recent HRI conferences did not report conducting a power analysis; an observation that has concerning implications for many conclusions drawn by these studies. This work is intended to raise awareness and encourage researchers to conduct power analyses when designing research studies using human participants.
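For readers unfamiliar with the procedure, a typical a-priori power analysis of the kind advocated here can be run in a few lines. The snippet below uses statsmodels with illustrative values (a medium effect size, alpha of 0.05, target power of 0.8); the numbers are not taken from the article.

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5,   # Cohen's d, a "medium" effect
                                   alpha=0.05,
                                   power=0.8,
                                   alternative='two-sided')
print(f"Required participants per condition: {n_per_group:.1f}")  # roughly 64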
Learning keypoints from synthetic data for robotic cloth folding
Thomas
Lips,
Victor-Louis
De Gusseme,
Francis
wyffels
In Faculty of Engineering and Architecture Research Symposium 2022 (FEARS 2022), Abstracts
2022
BIBLIO
Abstract
Robotic cloth manipulation is challenging due to its deformability, which makes determining its full state infeasible. However, for cloth folding, it suffices to know the position of a few semantic keypoints. Convolutional neural networks (CNN) can be used to detect these keypoints, but require large amounts of annotated data, which is expensive to collect. To overcome this, we propose to learn these keypoint detectors purely from synthetic data, enabling low-cost data collection. In this paper, we procedurally generate images of towels and use them to train a CNN. We evaluate the performance of this detector for folding towels on a unimanual robot setup and find that the grasp and fold success rates are 77% and 53%, respectively. We conclude that learning keypoint detectors from synthetic data for cloth folding and related tasks is a promising research direction, discuss some failures and relate them to future work. A video of the system, the codebase, and more details on the CNN architecture and the training setup can be found at https://github.com/tlpss/workshop-icra-2022-cloth-keypoints.
Learning keypoints from synthetic data for robotic cloth folding
Thomas
Lips,
Victor-Louis
De Gusseme,
Francis
wyffels
In ICRA 2022 Workshop on Representing and Manipulating Deformable Objects
2022
BIBLIO
Abstract
Robotic cloth manipulation is challenging due to its deformability, which makes determining its full state infeasible. However, for cloth folding, it suffices to know the position of a few semantic keypoints. Convolutional neural networks (CNN) can be used to detect these keypoints, but require large amounts of annotated data, which is expensive to collect. To overcome this, we propose to learn these keypoint detectors purely from synthetic data, enabling low-cost data collection. In this paper, we procedurally generate images of towels and use them to train a CNN. We evaluate the performance of this detector for folding towels on a unimanual robot setup and find that the grasp and fold success rates are 77% and 53%, respectively. We conclude that learning keypoint detectors from synthetic data for cloth folding and related tasks is a promising research direction, discuss some failures and relate them to future work. A video of the system, the codebase, and more details on the CNN architecture and the training setup can be found at https://github.com/tlpss/workshop-icra-2022-cloth-keypoints.git.
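One standard ingredient of such a pipeline, shown here as a hedged sketch rather than the paper's actual code, is converting keypoint coordinates in a (synthetic) image into Gaussian heatmap targets for the CNN; the image size, sigma and towel-corner coordinates below are placeholders.

import numpy as np

def keypoints_to_heatmaps(keypoints, height, width, sigma=4.0):
    # keypoints: (K, 2) array of (x, y) pixel coordinates -> (K, height, width) Gaussian heatmaps
    ys, xs = np.mgrid[0:height, 0:width]
    heatmaps = np.empty((len(keypoints), height, width), dtype=np.float32)
    for k, (x, y) in enumerate(keypoints):
        heatmaps[k] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    return heatmaps

corners = np.array([[20, 30], [100, 30], [100, 90], [20, 90]])  # hypothetical towel-corner pixels
targets = keypoints_to_heatmaps(corners, height=128, width=128)
print(targets.shape, targets.max())  # (4, 128, 128) 1.0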
Leveraging frozen pretrained written language models for neural sign language translation
Mathieu
De Coster,
Joni
Dambre
In INFORMATION
2022
BIBLIO
Abstract
We consider neural sign language translation: machine translation from signed to written languages using encoder–decoder neural networks. Translating sign language videos to written language text is especially complex because of the difference in modality between source and target language and, consequently, the required video processing. At the same time, sign languages are low-resource languages, their datasets dwarfed by those available for written languages. Recent advances in written language processing and success stories of transfer learning raise the question of how pretrained written language models can be leveraged to improve sign language translation. We apply the Frozen Pretrained Transformer (FPT) technique to initialize the encoder, decoder, or both, of a sign language translation model with parts of a pretrained written language model. We observe that the attention patterns transfer in zero-shot to the different modality and, in some experiments, we obtain higher scores (from 18.85 to 21.39 BLEU-4). Especially when gloss annotations are unavailable, FPTs can increase performance on unseen data. However, current models appear to be limited primarily by data quality and only then by data quantity, limiting potential gains with FPTs. Therefore, in further research, we will focus on improving the representations used as inputs to translation models.
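The freezing step of the FPT technique can be illustrated with a short Hugging Face sketch: reuse a pretrained written-language model and freeze almost all of its weights, leaving only the layer-norm parameters (plus any task-specific input and output layers) trainable. The model name and parameter-name matching below are assumptions for illustration, not the exact configuration used in the paper.

from transformers import AutoModel

# Load a pretrained written-language model (the model name is an illustrative choice).
model = AutoModel.from_pretrained("bert-base-uncased")

# Freeze everything except the layer-norm parameters.
for name, param in model.named_parameters():
    param.requires_grad = "LayerNorm" in name

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable / total:.2%} of {total} parameters")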
Leveraging plant physiological dynamics using physical reservoir computing
Olivier
Pieters,
Tom
De Swaef,
Michiel
Stock,
Francis
wyffels
In SCIENTIFIC REPORTS
2022
BIBLIO
Abstract
Plants are complex organisms subject to variable environmental conditions, which influence their physiology and phenotype dynamically. We propose to interpret plants as reservoirs in physical reservoir computing. The physical reservoir computing paradigm originates from computer science; instead of relying on Boolean circuits to perform computations, any substrate that exhibits complex non-linear and temporal dynamics can serve as a computing element. Here, we present the first application of physical reservoir computing with plants. In addition to investigating classical benchmark tasks, we show that Fragaria x ananassa (strawberry) plants can solve environmental and eco-physiological tasks using only eight leaf thickness sensors. The results indicate that plants are not suitable for general-purpose computation, but are well-suited for eco-physiological tasks such as estimating photosynthetic rate and transpiration rate. Having the means to investigate the information processing by plants improves quantification and understanding of integrative plant responses to dynamic changes in their environment. This first demonstration of physical reservoir computing with plants is key for transitioning towards a holistic view of phenotyping and early stress detection in precision agriculture applications, since physical reservoir computing enables us to analyse plant responses in a general way: environmental changes are processed by plants to optimise their phenotype.
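The readout side of this paradigm is a simple linear regression on the sensor signals. The sketch below is a toy Python illustration with simulated "leaf thickness" channels standing in for real plant measurements; it is not the experimental pipeline of the paper.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_steps, n_sensors = 2000, 8
env = np.sin(np.linspace(0, 40, n_steps)) + 0.1 * rng.standard_normal(n_steps)  # e.g. light intensity

# Stand-in for plant dynamics: each "leaf thickness" channel reacts non-linearly and with a lag.
gains = rng.uniform(0.5, 2.0, n_sensors)
states = np.column_stack([np.tanh(np.roll(env, lag + 1) * gains[lag]) for lag in range(n_sensors)])

split = int(0.8 * n_steps)
readout = Ridge(alpha=1e-2).fit(states[:split], env[:split])  # linear readout, as in reservoir computing
print("test R^2:", round(readout.score(states[split:], env[split:]), 3))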
Modeling electronic response properties with an explicit-electron machine learning potential
Maarten
Cools-Ceuppens,
Joni
Dambre,
Toon
Verstraelen
In JOURNAL OF CHEMICAL THEORY AND COMPUTATION
2022
BIBLIO
Abstract
Explicit-electron force fields introduce electrons or electron pairs as semiclassical particles in force fields or empirical potentials, which are suitable for molecular dynamics simulations. Even though semiclassical electrons are a drastic simplification compared to a quantum-mechanical electronic wave function, they still retain a relatively detailed electronic model compared to conventional polarizable and reactive force fields. The ability of explicit-electron models to describe chemical reactions and electronic response properties has already been demonstrated, yet the description of short-range interactions for a broad range of chemical systems remains challenging. In this work, we present the electron machine learning potential (eMLP), a new explicit electron force field in which the short-range interactions are modeled with machine learning. The electron pair particles will be located at well-defined positions, derived from localized molecular orbitals or Wannier centers, naturally imposing the correct dielectric and piezoelectric behavior of the system. The eMLP is benchmarked on two newly constructed data sets: eQM7, an extension of the QM7 data set for small molecules, and a data set for the crystalline beta-glycine. It is shown that the eMLP can predict dipole moments, polarizabilities, and IR-spectra of unseen molecules with high precision. Furthermore, a variety of response properties, for example, stiffness or piezoelectric constants, can be accurately reproduced.
Modular piezoresistive smart textile for state estimation of cloths
Remko
Proesmans,
Andreas
Verleysen,
Robbe
Vleugels,
Paula
Veske-Lepp,
Victor-Louis
De Gusseme,
Francis
wyffels
In SENSORS
2022
BIBLIO
Abstract
Smart textiles have found numerous applications ranging from health monitoring to smart homes. Their main allure is their flexibility, which allows for seamless integration of sensing in everyday objects like clothing. The application domain also includes robotics; smart textiles have been used to improve human-robot interaction, to solve the problem of state estimation of soft robots, and for state estimation to enable learning of robotic manipulation of textiles. The latter application provides an alternative to computationally expensive vision-based pipelines and we believe it is the key to accelerate robotic learning of textile manipulation. Current smart textiles, however, maintain wired connections to external units, which impedes robotic manipulation, and lack modularity to facilitate state estimation of large cloths. In this work, we propose an open-source, fully wireless, highly flexible, light, and modular version of a piezoresistive smart textile. Its output stability was experimentally quantified and determined to be sufficient for classification tasks. Its functionality as a state sensor for larger cloths was also verified in a classification task where two of the smart textiles were sewn onto a piece of clothing of which three states are defined. The modular smart textile system was able to recognize these states with average per-class F1-scores ranging from 85.7 to 94.6% with a basic linear classifier.
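The reported evaluation boils down to a basic linear classifier scored with per-class F1. The sketch below reproduces that recipe on synthetic sensor readings with placeholder dimensions and states; it does not use the paper's recordings.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_state, n_channels = 200, 16              # e.g. two 8-channel patches, three garment states
centers = rng.normal(0.0, 1.0, size=(3, n_channels))
X = np.vstack([c + 0.5 * rng.standard_normal((n_per_state, n_channels)) for c in centers])
y = np.repeat([0, 1, 2], n_per_state)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("per-class F1:", np.round(f1_score(y_te, clf.predict(X_te), average=None), 3))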
Multi-modal open world user identification
Bahar
Irfan,
Michael Garcia
Ortiz,
Natalia
Lyubova,
Tony
Belpaeme
In ACM TRANSACTIONS ON HUMAN-ROBOT INTERACTION
2022
BIBLIO
Abstract
User identification is an essential step in creating a personalised long-term interaction with robots. This requires learning the users continuously and incrementally, possibly starting from a state without any known user. In this article, we describe a multi-modal incremental Bayesian network with online learning, which is the first method that can be applied in such scenarios. Face recognition is used as the primary biometric, and it is combined with ancillary information, such as gender, age, height, and time of interaction to improve the recognition. The Multi-modal Long-term User Recognition Dataset is generated to simulate various human-robot interaction (HRI) scenarios and evaluate our approach in comparison to face recognition, soft biometrics, and a state-of-the-art open world recognition method (Extreme Value Machine). The results show that the proposed methods significantly outperform the baselines, with an increase in the identification rate up to 47.9% in open-set and closed-set scenarios, and a significant decrease in long-term recognition performance loss. The proposed models generalise well to new users, provide stability, improve over time, and decrease the bias of face recognition. The models were applied in HRI studies for user recognition, personalised rehabilitation, and customer-oriented service, which showed that they are suitable for long-term HRI in the real world.
On the pivotal role of water potential to model plant physiological processes
Tom
De Swaef,
Olivier
Pieters,
Simon
Appeltans,
Irene
Borra-Serrano,
Willem
Coudron,
Valentin
Couvreur,
Sarah
Garré,
Peter
Lootens,
Bart
Nicolaï,
Leroi
Pols,
Clément
Saint Cast,
Jakub
Šalagovič,
Maxime
Van Haeverbeke,
Michiel
Stock,
Francis
wyffels
In IN SILICO PLANTS
2022
BIBLIO
Abstract
Water potential explains water transport in the Soil-Plant-Atmosphere Continuum (SPAC), and is gaining interest as a connecting variable between ‘pedo-, bio- and atmosphere’. It is primarily used to simulate hydraulics in the SPAC, and is thus essential for studying drought effects. Recent implementations of hydraulics in large-scale Terrestrial Biosphere Models (TBMs) improved their performance under water-limited conditions, while hydraulic features of recent detailed Functional-Structural Plant Models (FSPMs) open new possibilities for dissecting complex traits for drought tolerance. These developments in models across scales deserve a critical appraisal to evaluate their potential for wider use in FSPMs, but also in crop systems models (CSMs), where hydraulics are currently still absent. After refreshing the physical basis, we first address models where water potential is primarily used for describing water transport along the transpiration pathway from the soil to the leaves, through the roots, the xylem and the leaf mesophyll. Then, we highlight models for three ecophysiological processes, which have well-recognised links to water potential: phloem transport, stomatal conductance and organ growth. We identify water potential as the bridge between soil, root and shoot models, as the physiological variable integrating below- and above-ground abiotic drivers, but also as the link between water status and growth. Models making these connections enable identifying crucial traits for ecosystem resilience to drought and for breeding towards improved drought tolerance in crops. Including hydraulics often increases model complexity, and thus requires experimental data on soil and plant hydraulics. Nevertheless, modelling hydraulics is insightful at different scales (FSPMs, CSMs and TBMs).
Photonic reservoir computing for nonlinear equalization of 64-QAM signals with a Kramers-Kronig receiver
Sarah
Masaad,
Emmanuel
Gooskens,
Stijn
Sackesyn,
Joni
Dambre,
Peter
Bienstman
In 2022 EUROPEAN CONFERENCE ON OPTICAL COMMUNICATION (ECOC)
2022
BIBLIO
Abstract
Photonic reservoir computing is a promising processing solution for the equalization of fiber optic communication signals. We simulate the nonlinear equalization of 64 Quadrature-Amplitude Modulated signals using a fully passive space multiplexed reservoir. The system deploys direct detection using the recently proposed Kramers-Kronig receiver.
Physically sound long-range interactions in machine-learning potentials
Maarten
Cools-Ceuppens,
Joni
Dambre,
Toon
Verstraelen
In Twelfth triennial congress of the World Association of Theoretical and Computational Chemists, WATOC 2020
2022
BIBLIO
Abstract
A suitable model for the potential energy surface (PES) is essential to any molecular simulation. Depending on the application of interest and the physics and chemistry at hand, one has to weigh different requirements, such as accuracy, ability to describe relevant processes, and computational efficiency. Machine-learning potentials have reached unprecedented compromises between these requirements: they can be trained to mimic density functional theory (or even better) training data to arbitrary precision, while they are computationally much more efficient than electronic structure methods. Despite their popularity and promise, machine-learning potentials have an important limitation: they are inherently short-ranged. The restriction to short ranges has two origins. First, it is common to use short real-space cutoff spheres to characterize the local environment of an atom. (To some extent, message-passing networks circumvent this limitation: through multiple message-passing iterations, information from beyond the cutoff distance is taken into account.) Second, the number of configurations of atoms at long distances is vast, making it impossible to generate relevant examples for all possibilities. Whenever long-range interactions matter, a physical model is unavoidable. Many physically motivated models for long-range electrostatics and polarization were developed for normal force fields, and were later combined with machine-learning potentials, which handle the short-range interactions. Advantages and disadvantages of several models for long-range interactions will be discussed, and a new framework, called electron machine-learning potential (eMLP), will be presented.
Progress, challenges and innovations of the SignON project
Vincent
Vandeghinste,
Mirella
De Sisto,
Dimitar
Shterionov,
Aoife
Brady,
Mathieu
De Coster,
Lorraine
Leeson,
Josep
Blat,
Frankie
Picron,
Marcello
Scipioni,
Aditya
Parikh,
Louis
Bosch,
John
O'Flaherty,
Joni
Dambre,
Jorn
Rijckaert
In 32nd Meeting of Computational Linguistics in The Netherlands, Abstracts
2022
Sign language translation : ongoing development, challenges and innovations in the SignON project
Dimitar
Shterionov,
Mirella
De Sisto,
Vincent
Vandeghinste,
Aoife
Brady,
Mathieu
De Coster,
Lorraine
Leeson,
Josep
Blat,
Frankie
Picron,
Marcello
Scipioni,
Aditya
Parikh,
Louis
Bosch,
John
O'Flaherty,
Joni
Dambre,
Jorn
Rijckaert
In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
2022
BIBLIO
Abstract
The SignON project (www.signon-project.eu) focuses on the research and development of a Sign Language (SL) translation mobile application and an open communications framework. SignON rectifies the lack of technology and services for the automatic translation between signed and spoken languages, through an inclusive, human-centric solution which facilitates communication between deaf, hard of hearing (DHH) and hearing individuals. We present an overview of the current status of the project, describing the milestones reached to date and the approaches that are being developed to address the challenges and peculiarities of Sign Language Machine Translation (SLMT).
Tactile interaction with a robot leads to increased risk-taking
Qiaoqiao
Ren,
Tony
Belpaeme
In SOCIAL ROBOTICS, ICSR 2022, PT1
2022
BIBLIO
Abstract
Tactile interaction plays a crucial role in interactions between people. Touch can, for example, help people calm down and lower physiological stress responses. Consequently, it is believed that tactile and haptic interaction also matter in human-robot interaction. We study whether the intensity of tactile interaction has an impact on people by examining whether different intensities modulate physiological measures and task performance. We use a paradigm in which a small humanoid robot is used to encourage risk-taking behaviour, relying on peer encouragement to take more risks, which might lead to a higher pay-off, but potentially also to higher losses. For this, the Balloon Analogue Risk Task (BART) is used as a proxy for the propensity to take risks. We study four conditions: one control condition in which the task is completed without a robot, and three experimental conditions in which a robot is present that encourages risk-taking behaviour with different degrees of tactile interaction. The results show that both low-intensity and high-intensity tactile interaction increase people’s risk-taking behaviour. However, low-intensity tactile interaction increases comfort and lowers stress, whereas high-intensity touch does not.
The effectiveness of dynamically processed incremental descriptions in human robot interaction
Christopher D.
Wallbridge,
Alex
Smith,
Manuel
Giuliani,
Chris
Melhuish,
Tony
Belpaeme,
Séverin
Lemaignan
In ACM TRANSACTIONS ON HUMAN-ROBOT INTERACTION
2022
BIBLIO
Abstract
We explore the effectiveness of a dynamically processed incremental referring description system using under-specified ambiguous descriptions that are then built upon using linguistic repair statements, which we refer to as a dynamic system. We build a dynamically processed incremental referring description generation system that is able to provide contextual navigational statements to describe an object in a potential real-world situation of nuclear waste sorting and maintenance. In a study of 31 participants, we test the dynamic system in a case where a user is remote operating a robot to sort nuclear waste, with the robot assisting them in identifying the correct barrels to be removed. We compare these against a static non-ambiguous description given in the same scenario. As well as looking at efficiency with time and distance measurements, we also look at user preference. Results show that our dynamic system was a much more efficient method—taking only 62% of the time on average—for finding the correct barrel. Participants also favoured our dynamic system.
Visual conversation starters for human-robot interaction
Ruben
Janssens,
Thomas
Demeester,
Tony
Belpaeme
In BNAIC/BeNeLearn 2022 : joint International Scientific Conferences on AI and Machine Learning, Abstracts
2022
Wavelength dimension in waveguide-based photonic reservoir computing
Emmanuel
Gooskens,
Floris
Laporte,
Chonghuai
Ma,
Stijn
Sackesyn,
Joni
Dambre,
Peter
Bienstman
In OPTICS EXPRESS
2022
BIBLIO
Abstract
Existing work on coherent photonic reservoir computing (PRC) mostly concentrates on single-wavelength solutions. In this paper, we discuss the opportunities and challenges related to exploiting the wavelength dimension in integrated photonic reservoir computing systems. Different strategies are presented to be able to process several wavelengths in parallel using the same readout. Additionally, we present multiwavelength training techniques that make it possible to increase the stable operating wavelength range by at least a factor of two. It is shown that a single-readout photonic reservoir system can operate with approximately 0% BER on several WDM channels in parallel for bit-level tasks and nonlinear signal equalization, even when taking manufacturing deviations and laser wavelength drift into account.