
Cross-Pilot Evaluation in MOTIVATE XR: Building Evidence Across Industrial Contexts

Validating an emerging XR platform in real industrial settings is not only a technical challenge; it is also a human and organisational one. Within MOTIVATE XR, five industrial pilots operate in highly diverse environments, ranging from aerospace and aluminium manufacturing to energy distribution, home appliances, and human-robot hybrid manufacturing. Each pilot is embedded in a distinct socio-technical context shaped by its own workflows, safety requirements, training practices, and expectations towards immersive technologies. 

These differences matter. An XR training system that fits well within one industrial setting may require significant adaptation in another. For this reason, evaluating the MOTIVATE XR platform solely at the level of individual pilots would provide an incomplete picture. What is needed instead is a structured way to understand technological maturity, user experience, and technology acceptance across contexts, while remaining sensitive to their specific constraints. 

Against this backdrop, TU Delft led the cross-pilot evaluation within Task 7.8. The objective was to generate a harmonised and comparable evidence base capable of validating the beta version of the MOTIVATE XR platform and guiding the next development cycle. Rather than focusing on isolated pilot outcomes, the evaluation was explicitly designed to look across pilots, identifying common patterns, context-dependent dynamics, and shared opportunities for refinement. 

This blog post describes the process, methodology, and value of the cross-pilot evaluation. It explains why such an evaluation is essential in a multi-pilot XR project, how it was conducted in practice, and how it supports evidence-based decision-making as MOTIVATE XR progresses towards final validation. 

Why a Cross-Pilot Evaluation Matters

In multi-pilot innovation projects, evaluation activities are often conducted locally. Each pilot may use different instruments, criteria, or timelines, reflecting local priorities and operational realities. While these local evaluations can be informative, they also create fragmentation. Without a shared framework, it becomes difficult to answer project-level questions that are critical for continuation, refinement, and scale-up. 

In particular, fragmented evaluation makes it challenging to assess how mature the technology is across pilots, how user perceptions change once participants move from expectations to hands-on experience, and which challenges point to platform-level design priorities rather than context-specific issues. 

The cross-pilot evaluation in MOTIVATE XR was designed to address these challenges directly. Its purpose was not to rank pilots or to label outcomes as successes or failures. Instead, it aimed to create a shared reference frame for understanding technological readiness, user experience, and technology acceptance across highly diverse industrial environments. This approach enables meaningful comparison while avoiding simplistic conclusions that ignore contextual differences. 

A Harmonised, Theory-Driven Evaluation Framework

To ensure comparability and methodological robustness, TU Delft implemented a consistent, mixed-methods evaluation framework across all five pilots. The framework integrates technological assessment with human-centred evaluation and is grounded in established research models and standards. 

At a high level, the framework combines three complementary perspectives: 

  • Technological maturity, assessed through Technology Readiness Levels (TRLs), to establish whether the platform is sufficiently stable and functional in relevant operational environments [1,2]. 
  • User experience and usability, capturing how the system is perceived and experienced during actual use, drawing on established usability and UX instruments [4–6]. 
  • Technology acceptance, examining how expectations, perceptions, and intentions evolve over time using the Technology Acceptance Model 3 (TAM3) [3]. 


Using validated instruments is essential for producing evidence that is not only descriptive but also interpretable and reusable across project phases. A structured evaluation timeline was defined to ensure that data were collected at comparable moments across pilots, allowing differences to be attributed to context and experience rather than to methodological inconsistencies. 
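To make the idea of "comparable moments" concrete, the sketch below shows, in Python, one way such a shared measurement schedule could be represented so that every pilot contributes the same grid of data points. It is an illustration of the approach only, not project code; all names, fields, and identifiers are hypothetical.

```python
# Illustrative sketch only: a shared measurement schedule, applied
# identically across all five pilots so that every data point is collected
# at a comparable moment. All names and fields are hypothetical.
MEASUREMENT_POINTS = [
    {"id": "trl_baseline",    "phase": "start of beta phase",  "instrument": "TRL assessment"},
    {"id": "ex_ante",         "phase": "before hands-on use",  "instrument": "TAM3 (pre-adoption)"},
    {"id": "ex_post_initial", "phase": "right after training", "instrument": "SUS + UEQ"},
    {"id": "ex_post_delayed", "phase": "after practical use",  "instrument": "TAM3 (post-adoption)"},
    {"id": "trl_final",       "phase": "end of beta phase",    "instrument": "TRL assessment"},
]

PILOTS = ["aerospace", "aluminium", "energy_distribution",
          "home_appliance", "hybrid_manufacturing"]

# Every pilot fills every cell of the same grid, which is what makes
# cross-pilot comparison possible later on.
evaluation_grid = {(pilot, point["id"]): None
                   for pilot in PILOTS
                   for point in MEASUREMENT_POINTS}
```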

Evaluation Process and Timeline 

The evaluation followed both the technology and the user journey over time, rather than relying on a single snapshot. This longitudinal perspective is particularly important for XR technologies, where first impressions may differ substantially from experience-based judgements formed after training and practical use. 

Each pilot was assessed in terms of Technology Readiness Level at baseline and again at the end of the beta validation phase. TRL assessment provides a structured way to position a technology’s maturity, ranging from early conceptual development to demonstration in relevant environments [1,2]. Anchoring the evaluation in TRL ensured that user feedback was interpreted in light of the platform’s development stage, rather than against expectations of a fully mature system. 
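For readers less familiar with the TRL scale, the snippet below lists the nine levels using the short definitions from the Horizon 2020 work programme [2] and shows how a baseline-to-final comparison can be summarised. The helper function and the example values are illustrative only, not actual pilot results.

```python
# The nine Technology Readiness Levels, using the short definitions from
# the Horizon 2020 work programme [2].
TRL = {
    1: "Basic principles observed",
    2: "Technology concept formulated",
    3: "Experimental proof of concept",
    4: "Technology validated in lab",
    5: "Technology validated in relevant environment",
    6: "Technology demonstrated in relevant environment",
    7: "System prototype demonstration in operational environment",
    8: "System complete and qualified",
    9: "Actual system proven in operational environment",
}

def trl_progress(baseline: int, final: int) -> str:
    """Summarise the maturity change between two assessment moments."""
    return (f"TRL {baseline} -> TRL {final}: "
            f"from '{TRL[baseline]}' to '{TRL[final]}'")

print(trl_progress(4, 6))  # hypothetical pilot moving from lab validation
                           # to demonstration in a relevant environment
```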

Before any hands-on interaction, participants completed an Ex-Ante survey capturing expectations and pre-adoption perceptions. A pre-adoption version of TAM3 was used to assess perceived usefulness, perceived ease of use, and their antecedents based on scenarios or demonstrations of the platform [3]. This step established a baseline against which later experience-based perceptions could be compared. 
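As an illustration of how such survey data are typically summarised, the sketch below averages multi-item 7-point Likert responses into construct scores for perceived usefulness (PU) and perceived ease of use (PEOU), following common practice for TAM-style instruments [3]. The responses and the paraphrased item wordings shown are hypothetical.

```python
from statistics import mean

# Hypothetical Ex-Ante responses: TAM3 constructs such as perceived
# usefulness (PU) and perceived ease of use (PEOU) are measured with
# several 7-point Likert items and summarised as the mean of those items [3].
ex_ante_responses = {
    "PU":   [5, 6, 5, 6],  # e.g. "Using the system would improve my job performance"
    "PEOU": [4, 3, 4, 4],  # e.g. "I would find the system easy to use"
}

construct_scores = {construct: round(mean(items), 2)
                    for construct, items in ex_ante_responses.items()}
print(construct_scores)  # {'PU': 5.5, 'PEOU': 3.75}
```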

User perceptions were then tracked through two Ex-Post measurement points. An immediate post-training evaluation captured initial user experience and usability impressions using standardised instruments such as the System Usability Scale (SUS) and the User Experience Questionnaire (UEQ) [4–6]. A delayed Ex-Post evaluation, conducted after practical use, assessed how acceptance and usage intentions evolved once users had time to integrate the system into their work context. This staged approach allows the evaluation to move beyond first impressions and capture more realistic, experience-based judgements. 
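The SUS, for instance, has a fixed published scoring rule [4,5]: ten items rated from 1 to 5, with odd-numbered items contributing (score − 1) and even-numbered items contributing (5 − score); the sum is then multiplied by 2.5 to yield a score between 0 and 100. A minimal implementation, with a made-up response set, might look as follows:

```python
def sus_score(items: list[int]) -> float:
    """Standard SUS scoring [4,5]: ten items rated 1-5; odd-numbered items
    contribute (score - 1), even-numbered items contribute (5 - score);
    the sum is multiplied by 2.5, giving a score between 0 and 100."""
    assert len(items) == 10 and all(1 <= s <= 5 for s in items)
    total = sum(s - 1 if i % 2 == 0 else 5 - s  # i == 0 is item 1, i == 2 is item 3, ...
                for i, s in enumerate(items))
    return total * 2.5

print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]))  # 80.0 (hypothetical responses)
```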

Highly Diverse Industrial Environments

A defining feature of the cross-pilot evaluation is the diversity of the industrial contexts involved. The five pilots represent markedly different training needs and operational realities. 

In the Aerospace Industry, XR is explored primarily as a tool for specialised and safety-critical training. Workflows are tightly regulated, training sessions are intensive, and accuracy is paramount. XR is therefore positioned as a complementary training solution used in focused sessions rather than in everyday routines. 

The Aluminium Industry involves physically demanding processes and complex machinery. Training focuses on operational procedures, safety, and process understanding. In this context, XR offers opportunities to visualise equipment and processes that are otherwise difficult to access during live operations. 

In the Energy Distribution Industry, training often concerns infrastructure, maintenance procedures, and risk management. XR can support scenario-based training for rare or hazardous situations, where real-world practice is costly or unsafe. 

The Home Appliance Industry represents a different training environment, characterised by production-line workflows, efficiency constraints, and shorter training cycles. Here, the integration of XR into existing routines poses distinct challenges related to usability, time constraints, and alignment with established practices. 

Finally, Human-Robot Hybrid Manufacturing involves collaborative work between human operators and robotic systems. Training in this context must support understanding of both technical systems and interaction dynamics. XR serves as a complementary tool within a broader training ecosystem rather than as a standalone solution. 

These differences underscore why a cross-pilot evaluation is both challenging and necessary. Rather than treating diversity as a limitation, the evaluation framework was designed to embrace contextual variation while maintaining comparability at the level of core constructs. 

What the Evaluation Delivers for MOTIVATE XR

The outcome of the cross-pilot evaluation is not a single score or ranking, but a clear, actionable diagnosis that supports project-level decision-making. The evaluation confirms that the MOTIVATE XR ecosystem has reached a level of maturity that enables meaningful user-centred assessment in real industrial environments. It demonstrates the value of combining technological indicators with human-centred evidence, rather than treating them in isolation. 

Importantly, the evaluation provides insight into how user expectations adjust after hands-on experience, helping partners understand adoption dynamics beyond initial enthusiasm. It also highlights where targeted refinements in usability, onboarding, and integration are most likely to have an impact, without reducing these insights to pilot-specific anecdotes. 

Because the same framework can be reused in subsequent phases, the evaluation also establishes a basis for longitudinal monitoring. This allows the project to track how improvements affect user experience and acceptance over time, supporting continuous learning rather than one-off validation. 

From Evidence to Targeted Iteration

The cross-pilot evaluation directly informs the next stages of MOTIVATE XR. Instead of relying on ad hoc feedback, partners can base refinement decisions on structured evidence collected consistently across pilots. This creates a feedback loop in which evidence from the beta phase guides prioritised development actions; improvements are re-evaluated using the same framework, and progress towards final validation can be assessed transparently. 

In this sense, the evaluation functions as an enabling mechanism for coordinating iteration across technical development, UX design, and adoption strategy. 

The cross-pilot evaluation conducted by TU Delft plays a central role in aligning MOTIVATE XR’s technical development with user experience and acceptance across highly diverse industrial environments. By applying a harmonised, theory-driven evaluation framework, the project has established a shared evidence base that supports informed decision-making and targeted refinement. 

In complex XR deployments, success depends not only on technological capability, but on how technologies are experienced, accepted, and integrated into everyday practice. The cross-pilot evaluation ensures that these dimensions are assessed systematically and comparably, strengthening MOTIVATE XR’s path towards final validation and future industrial adoption. 

[1] Mankins, J. C. (1995). Technology readiness levels: A white paper. NASA, Office of Space Access and Technology. 

[2] European Commission. (2014). Technology Readiness Levels (TRL). Horizon 2020 – Work Programme 2014–2015. 

[3] Venkatesh, V., & Bala, H. (2008). Technology acceptance model 3 and a research agenda on interventions. Decision Sciences, 39(2), 273–315. https://doi.org/10.1111/j.1540-5915.2008.00192.x 

[4] Brooke, J. (1996). SUS: A quick and dirty usability scale. In P. W. Jordan, B. Thomas, B. A. Weerdmeester, & I. L. McClelland (Eds.), Usability Evaluation in Industry (pp. 189–194). Taylor & Francis. 

[5] Brooke, J. (2013). SUS: A retrospective. Journal of Usability Studies, 8(2), 29–40. 

[6] Schrepp, M. (2015). User Experience Questionnaire Handbook: All you need to know to apply the UEQ successfully in your project. https://doi.org/10.13140/RG.2.1.2815.0245 

Authors

Nicola Franciulli

Delft Centre for Entrepreneurship

Nicola Franciulli is a PhD Candidate at the Delft Centre for Entrepreneurship. His research examines how Human-Centred Design can improve the acceptance and effective use of new technologies in mandatory adoption contexts, where use is required rather than voluntary. He approaches technology adoption as a value design problem, bridging organisational goals with users' lived experiences. 

Johannes Gartner

Delft Centre for Entrepreneurship

Johannes Gartner is an Assistant Professor of Digital Entrepreneurship & Technology Management. His research focuses on the diffusion, adoption, and value creation of deep-tech innovations, with particular attention to advanced manufacturing contexts. His work has been published in leading journals, including Technovation, Technological Forecasting & Social Change, and the Journal of Business Research. 
