CAOS 2025: THE 25TH ANNUAL MEETING OF THE INTERNATIONAL SOCIETY FOR COMPUTER ASSISTED ORTHOPAEDIC SURGERY
PROGRAM FOR TUESDAY, JUNE 17TH

09:00-10:30 Session 5: 3D Surgical Planning
09:00
Preliminary Clinical Outcome of Virtual Surgical Planning to Assist Reverse Total Shoulder Arthroplasty

ABSTRACT. Introduction Complex proximal humerus fractures in elderly patients are preferably treated with reverse total shoulder arthroplasty (RTSA). We hypothesized that patients with proximal humerus fractures benefit from virtual surgical planning (VSP) to avoid complications. Therefore, the aim was to compare the clinical outcome of RTSA with preoperative VSP against RTSA without planning. Methodology A cohort study was performed comparing two groups: RTSA with VSP vs. RTSA without VSP. Patients were included if they were scheduled for RTSA only, for an acute fracture within 28 days after trauma. The primary outcome measure was the range of motion (ROM) assessed for abduction, forward elevation and external rotation. The secondary outcome measures were complication rate, Patient Reported Outcome Measures (PROMs), operating time (minutes), and stem height of the prosthesis (mm). Results Thirty-four patients were included, with 27 in the RTSA with VSP group and 7 in the RTSA without VSP group. No significant differences were found between the groups for ROM, complication rate, or PROMs. Significant differences were found in favor of RTSA with VSP for operating time and stem height. Conclusion Preliminary data show some benefits of using VSP in RTSA, but full data collection is needed to confirm a positive effect on clinical outcome.

09:10
Statistical Shape Models for the planning of TKA surgery
PRESENTER: Anna Gounot

ABSTRACT. CT-based methods, such as robotic systems and patient-specific instrumentation (PSI), offer precise bone depiction, making them valuable for Total Knee Arthroplasty (TKA). However, they must be robust to the presence of cartilage, which is not easily visible on CT scans. We present here a coupled bone and cartilage Statistical Shape Model (SSM) that infers cartilage solely from bone shape. Four models were trained and tested for healthy and pathological patients, for both femur and tibia. Cartilage prediction results show good adaptability to the pathology, with accuracy similar to the inter-observer variability of manual MRI segmentation. This solution could be integrated into the planning of TKA surgeries to improve CT-based PSI and robotic systems.

09:20
Variation in preoperative 3D planning for total hip arthroplasty across hip surgeons
PRESENTER: Ryo Higuchi

ABSTRACT. This study examined variations in preoperative 3D planning for total hip arthroplasty (THA) among surgeons and the effect of implant size regulation on these variations. CT data from 15 patients were used to create 3D plans by six surgeons, all following the same planning protocol with the same prosthesis (Accolade II, Stryker). The analysis focused on implant positioning and angles in two groups: a "free group," where implant size was chosen without restriction, and a "regulation group," where implant size was fixed based on the actual surgical implant. In the free group, implant sizes were generally consistent across surgeons, differing by one size at maximum. However, there was considerable variability in implant positioning and angles, especially in stem anteversion, with differences up to 11°. Similar variation was observed in the regulation group, suggesting that controlling implant size did not reduce positional or angular discrepancies. Previous studies demonstrated high concordance between preoperative planning and actual surgery by the same surgeon, but this study found significant variability in preoperative planning across different surgeons. These results suggest that individual surgeons should perform their own preoperative planning to minimize variations and ensure that the plan aligns with their surgical objectives. In conclusion, this study indicates considerable inter-surgeon variability in THA planning, which was not influenced by implant size regulation. Results indicate the importance of personalized preoperative planning by the operating surgeon.

09:30
Computerised restoration of centre of rotation in primary total hip arthroplasty
PRESENTER: Johann Henckel

ABSTRACT. Total hip arthroplasty (THA) is often needed to relieve pain and improve mobility in patients with osteoarthritis. Restoration of the centre of rotation (CoR) improves the kinematics of the hip, including offset and leg length. The contralateral side is commonly used as a guide to aid positioning of the implant. However, it can often be diseased or replaced. Statistical shape modelling (SSM) can help guide CoR restoration and implant positioning without the need for a contralateral healthy side by predicting anatomical information that is missing in diseased anatomies. This retrospective cohort study involved 50 patients (23 females and 27 males) who underwent primary hip arthroplasty. An SSM was built on 100 healthy hemipelvises and used to virtually reconstruct the native pelvic morphology for all 50 test cases. The SSM-based reconstructions were then compared to the computerised tomography (CT)-based models. The outcome measure was the difference in CoR between the SSM-based reconstruction and 1) the diseased hip (severity of the defect), 2) the contralateral side (sense check). The median (interquartile range [IQR], min to max) difference in CoR between the diseased hips and their SSM-based reconstructions was found to be 2 mm (1 to 4 mm, 0 to 10 mm) in the medial-lateral (ML) direction, 3 mm (1 to 3 mm, -1 to 7 mm) in the inferior-superior (IS) direction and 2 mm (1 to 4 mm, 0 to 8 mm) in the anterior-posterior (AP) direction. The median difference in CoR between the healthy contralateral hips and the SSM-based reconstructions was 2 mm (1 to 3 mm, 0 to 4 mm) ML, 2 mm (1 to 3 mm, 0 to 4 mm) IS and 2 mm (1 to 2 mm, 0 to 5 mm) AP. This is the first study to apply an SSM to patients who underwent primary hip arthroplasty. As SSM allows for an optimal reconstruction of the native anatomy regardless of the presence of a healthy contralateral side, it is an important tool to aid preoperative planning of primary and complex primary THA.

09:40
A Pre-Operative Planning Tool for Patient Specific Guide Stability Assessment: An Experimental Validation

ABSTRACT. This work describes the development and experimental evaluation of a software tool (OrthoGrasp) to predict the stability of patient specific guide (PSG) designs. The tool assists in the design of PSGs to ensure that they will provide the high level of stability required to achieve successful and accurate surgical drilling tasks. OrthoGrasp adapts robotic grasping theory to analyse potential PSG designs by treating each point of contact between the guide and the host bone as a force-torque wrench that can be used to calculate stability metrics. The efficacy of OrthoGrasp was evaluated in this study by conducting a series of force-to-dislocation experiments for a range of glenoid anatomies and associated PSG designs that were analysed using OrthoGrasp. The OrthoGrasp-derived Least Resisted Wrench (LRW) and Volume Of the Polytope (VOP) metrics, adapted from the robotic grasping literature, were then compared to the experimental results using Spearman rank correlation analysis. The results demonstrated that the VOP metric was in good agreement with experimental results and thus could be used to predict the overall stability of a PSG design, but the LRW metric had poorer predictive value. In summary, this work demonstrated the potential of the OrthoGrasp pre-operative planning tool to objectively analyse and rank potential PSG designs to ensure that surgeons are provided with guides that stably fit their patients to assist in achieving optimal surgical accuracy.
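The comparison between a predicted stability metric and measured dislocation force can be illustrated with a small Spearman rank correlation routine. This is only a sketch: the function names and the sample values are hypothetical and are not taken from the study.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values: a stability metric per guide design and the
# measured force-to-dislocation for the same designs.
metric = [0.8, 1.5, 2.1, 3.0, 3.4]
force = [12.0, 15.0, 14.0, 22.0, 25.0]
rho = spearman(metric, force)
```

A coefficient near 1 would mean the metric ranks the designs in nearly the same order as the experiment, which is the property needed for ranking candidate guides.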

09:50
Identifying Mini-Plate Configurations with High Predicted Bone Union Propensity for Mandibular Reconstruction Surgery
PRESENTER: Merlin Bettin

ABSTRACT. Mandibular reconstruction is essential for restoring function and aesthetics in patients with jaw defects. This study evaluates different miniplate configurations to optimize bone union propensity (BUP) using physics-based simulation and finite element modeling. Various plate placements and screw configurations, covering a total of 10 cases, were tested to assess their impact on strain energy distribution and bone healing potential. The results indicate that miniplates with four screws provide superior stability and that higher placement enhances fixation. These findings contribute to refining patient-specific reconstruction strategies and improving surgical outcomes.

10:00
Systematic Rule-Based Regional Radiologic Classification of Traumatic Pelvic Ring Fractures: An Observer Variability Study
PRESENTER: Roey Ben Yosef

ABSTRACT. Purpose: The Young-Burgess classification system is commonly used to classify traumatic pelvic ring fractures for treatment planning. In the emergency room, classification is performed on a pelvic AP radiograph using general guidelines, so results may vary between observers and may not be explainable. We aimed to validate a new rule-based regional anatomical system for systematic, explainable classification that is amenable to automation.

Methods: The rule-based pelvic regions system divides each pelvic radiograph into 11 distinct, partially overlapping pelvic regions. Each region is independently evaluated for radiographic findings – normal or injured. The Young-Burgess class is then determined with rules that combine the pelvic region evaluations. Fifty pelvic radiographs were evaluated and classified into one of the Young-Burgess classes by three experienced orthopedic trauma surgeons. Each radiograph was assessed twice in separate sessions: once as a full image (Gestalt) and once with the pelvic regions only, presented in random locations to avoid providing spatial cues (Per-region). Inter-observer agreement was quantified with weighted kappa scores.

Results: The Gestalt and the Per-region evaluations had comparable inter-observer agreement, with mean weighted kappa scores of 0.46 and 0.47 (ranges 0.40-0.61 and 0.39-0.56), respectively. The Per-region approach had slightly more consistent scores across different observer pairs.

Conclusion: Performing Young-Burgess pelvic ring injury classification with a rule-based regional anatomical system yields observer agreement scores comparable to conventional whole-image evaluation. This suggests that machine learning methods using the new system may achieve results similar to human experts and provide more transparent and interpretable results than whole-image black box methods.
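As an illustration of the agreement statistic used above, the following sketch computes a weighted kappa for one pair of observers. It assumes linear disagreement weights and integer-coded ordinal classes; the abstract does not state the weighting scheme or the class coding, so both are assumptions.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Linear-weighted Cohen's kappa for two raters.

    Ratings are integers in 0..n_categories-1 (an assumed coding).
    """
    n = len(rater_a)
    k = n_categories
    # Observed joint proportions of (rater_a class, rater_b class).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    return 1 - weighted_observed / weighted_expected
```

With three observers, the statistic would be computed for each observer pair and then averaged, which matches the mean and range reporting used in the results.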

10:10
Can we Predict Robotic-Assisted Surgical Plans in Total Knee Arthroplasty? A Machine Learning Story
PRESENTER: Alexander Orsi

ABSTRACT. The ability to predict robotic-assisted surgical plans could enhance surgical planning and potentially patient outcomes in orthopaedics. This study explores using machine learning to predict OMNIBotics plans by developing individual models for each planning parameter. A dataset of over 11,500 de-identified log files from robotic-assisted total knee arthroplasties (TKAs) performed with the OMNIBotics system was used to train machine learning models to predict surgeon-selected intraoperative planning parameters. Separate models were developed for each tibial and femoral planning parameter, incorporating preoperative kinematics, native geometries, and initial balance assessment data. Eight models were trained for the following parameters: tibial coronal angle, tibial medial resection thickness, tibial slope angle, femoral coronal angle, femoral distal medial resection thickness, femoral axial rotation angle, femoral posterior medial resection thickness, and femoral flexion angle. For model training, XGBoost was implemented with hyperparameter tuning to optimize predictive performance. Regression techniques were used for prediction, and the models were evaluated using several metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the percentage of predictions within 1.5° or 1.5 mm of the surgeon-selected value. Of the 8 models, the greatest RMSE was 1.7, and the greatest MAE was 1.4. Six of the 8 models had 75% or more of their predictions within 1.5° or 1.5 mm, with the worst being 61%. The results highlight the effectiveness of machine learning in predicting OMNIBotics planning parameters and its potential clinical applications. This proof of concept will be refined and integrated into the OMNIBotics software. With more data and further model tuning, its performance is expected to improve.
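The three evaluation metrics named above can be written out as a minimal sketch. The function names are hypothetical, and the tolerance of 1.5 is in the unit of the parameter being predicted (degrees or millimetres).

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predictions and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, observed):
    """Mean absolute error between predictions and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def percent_within(predicted, observed, tolerance=1.5):
    """Percentage of predictions within `tolerance` of the observed value."""
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= tolerance)
    return 100.0 * hits / len(predicted)
```

RMSE penalises large misses more heavily than MAE, so reporting both alongside a within-tolerance percentage gives a fuller picture of how often a predicted plan would need manual adjustment.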

10:20
3D alignment analysis principles: an international Delphi consensus study
PRESENTER: Quinten Veerman

ABSTRACT. Introduction 3D bone models are increasingly adopted for leg alignment analysis, but there is substantial variability in the methods and underlying principles used to derive axes and joint orientations from 3D bone models. Therefore, the purpose was to reach consensus on a structured framework for standardized 3D leg alignment analysis based on 3D bone models. Methodology A Delphi study was performed in four rounds. Rounds 1 and 2 involved a steering and rating group that developed 31 statements based on principles preserving the complexity of 3D anatomical structures, identified through a systematic review. These statements encompassed deriving joint centres and joint orientations, and defining coordinate systems using 3D bone models. In Rounds 3 and 4, an international panel of experts evaluated these statements. Consensus was defined as ≥80% agreement. Results Of the 31 statements, 26 achieved consensus in Round 3. Five statements were refined and subsequently all achieved consensus in Round 4. Experts agreed on utilising all available relevant surface data to define joint centres, joint orientations, and individual femoral and tibial coordinate systems alongside a combined leg coordinate system, and adopting central 3D axes for femoral version and tibial torsion. Conclusion This international Delphi consensus study provides a structured framework for a standardized 3D leg alignment analysis based on 3D bone models. By utilizing all relevant surface data, this framework provides a more accurate representation of joint geometries compared to traditional landmark-based methods. Future research should focus on validating the methods adhering to these principles in diverse clinical settings.

11:00-12:30 Session 6: Patient-Specific Treatment & Personalized Health
11:00
Driven by Innovation - Developments in CAS
11:25
Utility of muscle cross-sectional area in screening whole-body skeletal muscle mass loss
PRESENTER: Sotaro Kono

ABSTRACT. The psoas muscle's cross-sectional area (CSA) at the third lumbar vertebra (L3) is commonly used to predict skeletal muscle mass (SMM) loss, a marker of sarcopenia. However, the predictive capability of other muscle CSAs remains unclear. This study aimed to determine whether lower extremity muscle CSAs can screen for SMM loss, evaluated by skeletal muscle index (SMI), and to assess whether they outperform the psoas CSA in detecting SMI loss. Fifty-nine patients who underwent CT imaging of the lower extremities and a whole-body DXA scan were retrospectively analyzed. Using a deep-learning model, CSAs of the psoas, gluteal muscles, thigh muscles, and lower leg muscles were measured at various levels, and their diagnostic performance in predicting sarcopenia (SMI < 5.4 kg/m²) was evaluated using receiver operating characteristic curve analysis. Then, the area under the curve (AUC) of each CSA was compared. The psoas CSA at L3 showed an AUC of 0.80, whereas the AUC was 0.93 for the gluteal muscle CSA measured at the anterior superior iliac spine, 0.93 for the thigh muscle CSA measured at the distal thigh level, and 0.82 for the lower leg muscle CSA measured at the proximal level. No significant difference was found between the CSAs in detecting sarcopenia. In conclusion, our study showed the potential of muscle CSA of the hip, thigh, and lower leg to detect SMI loss, indicating that CT imaging acquired for other purposes may be used to screen for sarcopenia.
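The AUC reported for each muscle CSA can be understood as the probability that a randomly chosen sarcopenic patient receives a higher risk score than a randomly chosen non-sarcopenic patient. The sketch below computes it by direct pairwise comparison; the function name and the CSA values are hypothetical, and a smaller CSA is assumed to indicate higher risk.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical thigh CSA values in cm^2. A smaller CSA is assumed to
# indicate higher risk, so the score is the negated CSA.
csa_sarcopenic = [78.0, 85.0, 92.0]
csa_not_sarcopenic = [88.0, 110.0, 121.0, 130.0]
thigh_auc = auc([-v for v in csa_sarcopenic], [-v for v in csa_not_sarcopenic])
```

This pairwise form is equivalent to the area under the ROC curve and makes clear why an AUC of 0.5 corresponds to no discrimination.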

11:35
Computational Modelling Shows a Clinically Significant Difference Between the Internal and External Proximal Femoral Canal Shape
PRESENTER: Johann Henckel

ABSTRACT. Three-dimensional (3D) surgical planning of total hip arthroplasty (THA) uses the femoral neck version (FNV) to plan femoral stem orientation. However, this has proven to be an inaccurate guide, and no implant manufacturer can accurately predict the femoral stem version of their uncemented stems. We aimed to better understand the shape differences between the external surface of the proximal femur and the internal femoral canal using statistical shape modelling (SSM). We generated sex-specific SSMs of the external anatomy of the proximal femur and of the internal femoral canal (totaling 4 models) using 80 preoperative CT scans of patients with osteoarthritis (40 male, 40 female). The anatomical modes of variation between the models were compared to study the shape differences. The main modes of variation in both the external and internal shape were: 1) size; 2) version; 3) movement towards and away from the midline of the body; and 4) prominence of the greater trochanter. We found that the principal component which demonstrated FNV did not correspond with the principal component which revealed intramedullary femoral version. This illustrates that the internal femoral canal shape does not follow the trends and patterns of the external surface and hence, the FNV should not be used. This study shows how SSM can be applied to explore differences in the shape of the proximal femur and femoral canal. This information may be used to aid preoperative prediction of femoral stem version.

11:45
Targeting Functional Deficits: Associations Between Distal Femur Morphology and Passive and Dynamic Frontal Plane Knee Kinematics in Arthroplasty Patients for Personalized Robotic Surgery.
PRESENTER: Nadim Ammoury

ABSTRACT. Restoring knee function to pre-diseased levels after arthroplasty remains challenging, as common surgical approaches do not easily account for the variability in joint-level function, leaving some patients with unmet functional expectations. Robotic-assisted knee arthroplasty (RAKA) enhances surgical precision and accuracy, but opportunities remain to better consider patient-specific morphological and anatomical variability and its influence on both passive and dynamic joint function in optimizing surgical decisions for the individual. This study examines the relationships between distal femur morphology, joint alignment, intraoperative passive knee kinematics, and active kinematics during walking to inform tailored knee arthroplasty surgical protocols. Forty patients with end-stage knee osteoarthritis participated to date. Pre-operative gait kinematics were captured using markerless motion analysis. Passive kinematics were recorded intraoperatively under varus-valgus stress conditions with a robotic system. Morphological variables were measured on distal femurs modeled from computed tomography images to which principal component analysis was applied to reduce dimensionality and identify key morphometric shapes among this patient population. PC1, characterized by wider femurs with elevated anterior condyles, was correlated with higher mean knee adduction angles during gait. PC2, reflecting longer femurs with flatter anterior condylar grooves, correlated with greater frontal plane variability during gait and higher passive angular movement under varus stress at 10° flexion. These results highlight the influence of femoral morphology on knee mechanics and underscore the potential of integrating anatomical and morphometric variability into RAKA protocols to target functional outcomes. Continued exploration of these relationships could lead to improved post-arthroplasty functional outcomes tailored to individual patient needs.

11:55
Impact of the Individual Femoral Degree of Freedom on the Restoration of the Trochlea – A Sensitivity Analysis
PRESENTER: Laurent Angibaud

ABSTRACT. Objective Alignment philosophy in total knee arthroplasty (TKA) is continuously evolving, with a focus often on the femorotibial joint at the potential detriment of the patellofemoral joint. This study assesses the impact of each degree of freedom (DOF) on the femoral component trochlea.

Material & Methods Four TKAs were performed using an enabling technology on four cadaveric specimens. The native femoral trochlea was mapped before and after implantation of the femoral component. Point clouds of the native trochlea and the femoral component were exported to computer-aided design software to generate mesh surfaces. Cross sections of interest were established to characterize the anteroposterior (AP) and mediolateral (ML) offsets between native and prosthetic trochlea. The femoral component was digitally translated/rotated in 2 mm/° increments along/around each DOF. For each simulated position, the AP and ML offsets were measured, and a sensitivity factor (SF) was calculated as the magnitude of the slope of the linear regression between increments and offsets.

Results AP and ML translation directly influenced AP and ML offsets, respectively, with SF values near 1. Flexion/extension and axial translation moderately impacted AP offset, while varus/valgus and axial rotation showed minimal impact. Impact signatures were consistent across specimens.

Conclusion This study highlights the unique impact of each DOF on the trochlea restoration and surgeons should be aware of this finding. For instance, flexing the femoral component to adjust the flexion gap affects the AP position of the prosthetic trochlea. In this regard, enabling technology could guide management of both patellofemoral and femorotibial joints in TKA.
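The sensitivity factor described in the methods can be sketched as the absolute slope of an ordinary least-squares line fitted to offsets against increments. Reading the SF as the slope magnitude is an assumption, as are the function name and the sample values.

```python
def sensitivity_factor(increments, offsets):
    """Absolute slope of the least-squares line of offsets vs. increments."""
    n = len(increments)
    mean_x = sum(increments) / n
    mean_y = sum(offsets) / n
    sxx = sum((x - mean_x) ** 2 for x in increments)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(increments, offsets))
    return abs(sxy / sxx)


# Hypothetical simulated positions along one DOF and the resulting AP offsets.
ap_translation_mm = [-4, -2, 0, 2, 4]
ap_offset_mm = [-4.0, -2.0, 0.0, 2.0, 4.0]
sf = sensitivity_factor(ap_translation_mm, ap_offset_mm)
```

Under this reading, an SF near 1 means the offset changes one-for-one with the DOF, which is the behaviour reported for AP and ML translation, while an SF near 0 means the DOF barely moves the trochlea in that direction.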

12:05
Predicting CT-based Coronal Plane Knee Phenotype Parameters using Imageless Navigation and Machine Learning
PRESENTER: Alexander Orsi

ABSTRACT. Coronal plane alignment of the knee (CPAK) categorizes knee phenotypes using joint line obliquity (JLO) and arithmetic hip-knee-ankle angle (aHKA). These are both determined using the medial proximal tibial angle (MPTA) and lateral distal femoral angle (LDFA). CPAK is traditionally measured using long leg radiographs, but recently other modalities have been used, such as computed tomography (CT) and imageless navigation. The aim of this study is to understand how accurately imageless navigation with wear assumptions measures CPAK relative to CT. Ninety-three TKAs performed using imageless navigation that also had preoperative CT were retrospectively reviewed. MPTA and LDFA were measured from both the preoperative CT and the intraoperative imageless navigation landmark data. JLO was calculated as MPTA + LDFA, and aHKA was calculated as MPTA − LDFA. Four articular cartilage wear assumptions were applied to the imageless navigation data. One applied no wear correction, one used traditional preop coronal HKA thresholds, another used MPTA and LDFA thresholds, and lastly a machine learning algorithm was used. These were evaluated to determine which best approximated the measurements from the CT. The machine learning based wear assumptions had the lowest mean absolute error (MAE) for all CPAK parameters, with MAE ≤1.2° for MPTA and LDFA, and ≤1.8° for JLO and aHKA. Imageless navigation can measure MPTA and LDFA with a mean error of <1.2° compared to CT when using a machine learning model to predict cartilage wear. These results indicate that imageless navigation can be used to effectively measure CPAK parameters, achieving comparable results to a CT-based approach.
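The two CPAK parameters follow directly from the formulas given above. The sketch below also maps them to a phenotype number from 1 to 9; the boundaries used (aHKA neutral within ±2°, JLO neutral within 180° ± 3°) are the commonly cited CPAK boundaries and are an assumption here, since the abstract does not state them. Function names are hypothetical.

```python
def cpak_parameters(mpta, ldfa):
    """Return (aHKA, JLO) in degrees from MPTA and LDFA."""
    ahka = mpta - ldfa  # arithmetic hip-knee-ankle angle
    jlo = mpta + ldfa   # joint line obliquity
    return ahka, jlo


def cpak_type(mpta, ldfa, ahka_tol=2.0, jlo_tol=3.0):
    """Return the CPAK phenotype number (1 to 9).

    Boundaries are assumed: varus if aHKA < -ahka_tol, valgus if
    aHKA > +ahka_tol; apex distal if JLO < 180 - jlo_tol, apex proximal
    if JLO > 180 + jlo_tol.
    """
    ahka, jlo = cpak_parameters(mpta, ldfa)
    column = 0 if ahka < -ahka_tol else (2 if ahka > ahka_tol else 1)
    row = 0 if jlo < 180 - jlo_tol else (2 if jlo > 180 + jlo_tol else 1)
    return row * 3 + column + 1
```

For example, an MPTA of 85° with an LDFA of 89° gives an aHKA of -4° and a JLO of 174°, which falls in type 1 (varus with an apex distal joint line). Errors of 1° to 2° in MPTA or LDFA can move a knee across these boundaries, which is why the measurement error relative to CT matters.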

12:15
Evaluating the impact of robotic assisted personalized alignment in TKA on coronal plane alignment and associated functional outcomes
PRESENTER: Mark Maher

ABSTRACT. Traditional mechanical alignment (MA) in total knee arthroplasty (TKA) has long been the gold standard, targeting a neutral hip-knee-ankle (HKA) angle to enhance implant longevity. However, inconsistencies in functional outcomes and patient satisfaction have led to the emergence of personalized kinematic alignment strategies. This study evaluates the effects of robotic-assisted personalized alignment (R-PA) on coronal plane alignment of the knee (CPAK) and associated clinical outcomes. A retrospective review was conducted on 48 patients who underwent primary TKA using the CORI Surgical System with R-PA from October 2021 to August 2023. CPAK classifications were assessed pre- and postoperatively, and functional outcomes were measured using the Oxford Knee Score (OKS) at a minimum of 12 months. Preoperatively, CPAK phenotype I (varus & apex distal joint line) was predominant (65%), most often transitioning to phenotype IV (varus & neutral joint line) (40%) following R-PA. Most patients (71%) maintained their HKA alignment category, and all other changes in HKA category resulted in neutral alignment. Postoperative joint line obliquity (JLO) shifted toward neutrality in 83% of cases. Mean OKS was 35.8, with 87% achieving a patient-acceptable symptom state (PASS). R-PA in TKA results in predictable changes to CPAK, maintaining native constitutional alignment or transitioning to neutral aHKA without creating new varus/valgus alignment. This alignment strategy is associated with favorable functional outcome scores at over 12 months of follow-up. Further large-scale and long-term randomized studies are warranted to assess implant survivorship and clinical outcomes.

13:30-15:00 Session 7: Artificial Intelligence in Orthopaedics
13:30
Application of Unsupervised Machine Learning to Classify Dynamic Knee Alignment in Total Knee Arthroplasty Using a Computer Assisted Orthopaedic Surgery System
PRESENTER: Laurent Angibaud

ABSTRACT. Objective Restoring normal knee kinematics is crucial for optimal outcomes in total knee arthroplasty (TKA). Traditionally, surgeons have aimed for neutral alignment; however, recent studies highlight the dynamic nature of knee alignment and deformity patterns throughout the flexion arc. This study introduces an automated dynamic knee alignment classification model utilizing varus & valgus (V&V) angles captured throughout the full flexion arc with a computer-assisted orthopaedic surgery (CAOS) system.

Material & Methods A retrospective review of 1336 cases performed by 11 surgeons was conducted. Kinematics recorded by the CAOS system, including V&V angles captured over the full flexion arc (0° to 120°), were used. Features were extracted by transforming the 2-dimensional input space (flexion and V&V) into a 12-dimensional feature space, where each dimension corresponds to a V&V angle at a specific flexion angle. Cases were categorized based on the presence of flexion contracture, followed by k-means clustering to determine the optimal number of dynamic alignment categories.

Results The model identified five distinct clusters: neutral, valgus, low varus, moderate varus, and high varus. Each cluster displayed unique centroid trajectories of V&V angles across the flexion arc. Surgeon-specific case distribution analysis revealed notable differences in deformity patterns, reflecting individual practice variations.

Conclusion This study demonstrates the utility of unsupervised machine learning in categorizing patient deformities using dynamic knee kinematics data. This model enables surgeons to better assess knee alignment by understanding the distinct patterns across the full flexion arc, rather than relying on static measurements.
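The feature extraction and clustering steps described in the methods can be sketched as follows. This is a simplified illustration: the 12 flexion angles (0° to 110° in 10° steps), the linear interpolation, the first-k initialisation, and all names are assumptions, since the abstract does not specify them.

```python
# Assumed sampling grid: 12 flexion angles, 0 to 110 degrees in 10-degree steps.
FLEXION_GRID = [10 * i for i in range(12)]


def interpolate(samples, angle):
    """Linearly interpolate the V&V angle at `angle` from (flexion, vv) samples."""
    samples = sorted(samples)
    if angle <= samples[0][0]:
        return samples[0][1]
    if angle >= samples[-1][0]:
        return samples[-1][1]
    for (f0, v0), (f1, v1) in zip(samples, samples[1:]):
        if f0 <= angle <= f1:
            if f1 == f0:
                return v0
            return v0 + (v1 - v0) * (angle - f0) / (f1 - f0)


def features(samples):
    """Map a recorded (flexion, vv) curve to a 12-dimensional feature vector."""
    return [interpolate(samples, a) for a in FLEXION_GRID]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50):
    """Plain k-means; centroids start at the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each centroid is itself a 12-value V&V trajectory across the flexion arc, which is what allows the clusters to be interpreted as alignment patterns such as neutral or high varus. In practice the number of clusters would be chosen with a criterion such as the elbow or silhouette method, and a randomised initialisation with restarts would be more robust than the first-k start used here.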

13:40
Automatic Classification of Traumatic Pelvic Ring Fractures Based On a Rule-Based Regional Radiologic Classification Method
PRESENTER: Roey Ben Yosef

ABSTRACT. Purpose: The Young-Burgess pelvic ring classification system is commonly used to support treatment planning. In the emergency room, it is performed on pelvic AP X-rays using general guidelines, whose results may vary between observers and may not be explainable. We aimed to validate a novel computerized method for the automatic Young-Burgess classification of traumatic pelvic ring fractures on pelvic AP X-rays using a rule-based regional anatomical system that provides systematic and explainable classifications.

Methods: The method takes a pelvic AP X-ray as input. It extracts 11 pelvic regions using a pelvic regions atlas and a deep-learning model, classifies each pelvic region as normal/injured with another deep-learning network, and computes the Young-Burgess class with rules that combine the pelvic region classifications. The outputs are the Young-Burgess class and the pelvic AP X-ray with overlaid color-coded pelvic regions. Evaluation was performed on 564 pelvic AP X-rays classified by a senior orthopedic trauma surgeon, on which 11 pelvic regions were delineated and classified as normal/injured. Two YOLOv8 deep-learning models were trained on 544 pelvic AP X-rays and tested on 20.

Results: Pelvic region computation yielded a mean F1-score of 0.99 (0.98-1.00). Pelvic region classification yielded a specificity of 1.00 and a mean sensitivity of 0.53 (0.20-1.00). The rule-based classification yielded a mean weighted kappa score of 0.47 and AUC score of 0.97.

Conclusion: Automatic Young-Burgess pelvic ring injury classification on pelvic AP X-rays with a rule-based regional anatomical system appears promising. Pelvic region computation is nearly perfect and rule-based classification is also excellent. Pelvic region classification requires further investigation.
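The per-region metrics reported above can be derived from a binary confusion matrix, as in the sketch below. The function name and the labels are hypothetical; 1 stands for an injured region and 0 for a normal one.

```python
def binary_metrics(truth, predicted):
    """Return sensitivity, specificity and F1 for binary labels (1 = injured)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}
```

A classifier that rarely predicts "injured" can reach a specificity of 1.00 while missing many true injuries, which is consistent with the combination of perfect specificity and moderate sensitivity reported for region classification.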

13:50
Multi-Anatomy Simulations with Graph Neural Network
PRESENTER: Xintian Yuan

ABSTRACT. Real-time simulation of tissue deformation is paramount for developing and testing applications in surgical robotics, medical training and surgical planning. However, most current studies use traditional Finite Element Method (FEM) based simulation, which is computationally expensive, limiting its practicality in real-time settings. We present a novel simulation pipeline that integrates FEM-based modeling with a Graph Neural Network (GNN) to predict the deformation of the lumbar spine and surrounding soft tissues under interaction with a robotic ultrasound probe. The GNN leverages a Graph Attention Network (GAT) architecture to predict deformation across mesh nodes. It significantly outperforms the previous method, PhysGNN, while using 30-fold less time than physical simulation.

14:00
From Unlabeled Data to Clinical Applications: Foundation Models in Medical Imaging

ABSTRACT. This study explores the potential of self-supervised learning to address the challenges posed by limited labeled datasets in medical imaging. To this end, a curated collection of over 632,000 unlabeled X-ray images from intraoperative C-arm scans was used to pre-train various feature extraction backbones with the DINO framework. The resulting foundation models capture domain-relevant features that can be reused in a variety of medical imaging tasks. To demonstrate their adaptability, the pre-trained backbones were fine-tuned for three different downstream tasks: body region classification, metal implant segmentation, and screw object detection. This was achieved by training lightweight, task-specific head networks while keeping the backbones fixed, significantly reducing computational requirements and training time. The approach performed well: vision transformers proved proficient in classification, achieving an accuracy of 96.9%, while ResNet-based backbones excelled in segmentation and detection, as evidenced by a Dice score of 94.1%. The findings underscore the complementary strengths of different model architectures. Vision transformers have been shown to effectively capture global patterns, making them well-suited for tasks that require broad contextual understanding. Conversely, ResNet architectures demonstrate strengths in spatial precision, which leads to superior performance in pixel-level predictions. Nevertheless, the generalization of the proposed approach across a more diverse set of clinical scenarios remains to be validated. In summary, the study underscores the scalability and efficiency of self-supervised learning in developing adaptable foundation models and offers a streamlined path to integrating advanced deep learning solutions into clinical workflows.
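For readers unfamiliar with the segmentation metric quoted above, the Dice score measures the overlap between a predicted mask and a reference mask. The sketch below uses nested lists of 0 and 1 as masks; the function name and the convention for two empty masks are assumptions.

```python
def dice_score(mask_a, mask_b):
    """Dice overlap of two binary masks given as nested lists of 0/1."""
    intersection = sum(
        a * b for row_a, row_b in zip(mask_a, mask_b) for a, b in zip(row_a, row_b)
    )
    total = sum(map(sum, mask_a)) + sum(map(sum, mask_b))
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total
```

The score ranges from 0 (no overlap) to 1 (identical masks), so a value of 94.1% indicates that the predicted implant mask and the reference mask overlap almost completely.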

14:10
Use of Regression and Machine Learning to Predict Spinopelvic Mobility 1 Year after Total Hip Arthroplasty
PRESENTER: Linden Bromwich

ABSTRACT. While the importance of pre-operative radiographic analysis of patients undergoing THA is well established, the relationship between pre- and post-operative spinopelvic parameters remains poorly understood. This study aims to investigate the predictability of important spinopelvic mobility parameters 1 year post-operatively using 2D and 3D radiographic measurements and patient demographic data taken pre-operatively. Pre- and 1-year post-operative radiographic data from 1,254 THA patients were combined with both multiple regression and gradient-boosting (XGBoost) models to assess the performance of predicting the following post-operative spinopelvic measurements: standing pelvic tilt, seated pelvic tilt, and lumbar flexion. For the dataset available, the multiple regression models performed comparably with the more complex XGBoost models. The best performing model was for prediction of standing pelvic tilt, achieving an RMSE of 3.2°, an R² of 0.78, an MAE of 2.4°, and 89.2% of predictions within ±5° of observed. With larger variability in their pre- and post-operative measurements, the performance of the lumbar flexion and, particularly, the seated pelvic tilt models was more modest, indicating there is considerable influence of unmodelled factors in the observed values. This study demonstrates that predictive models may be employed to investigate the changes in spinopelvic mobility parameters post-operatively. Work is ongoing to improve the predictive capacity of the models by expanding the training/testing dataset and further fine-tuning the models.
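
As an illustration only, the four figures quoted for the standing pelvic tilt model (RMSE, R², MAE, and the share of predictions within ±5° of observed) can be computed as in the sketch below. This is not the authors' code: the function name `regression_metrics`, the `tolerance` parameter, and the sample angles are hypothetical.

```python
import math


def regression_metrics(observed, predicted, tolerance=5.0):
    """Return RMSE, MAE, R^2 and the percentage of predictions within +/- tolerance."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares

    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
        "pct_within": 100.0 * sum(abs(e) <= tolerance for e in errors) / n,
    }


# Made-up angles in degrees, purely to exercise the function.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 34.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Reporting the within-tolerance percentage alongside RMSE and MAE is useful because it expresses accuracy against a fixed angular band rather than as an average error.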

14:20
Expert Validation of CT-Based Machine Learning Model for Segmentation and Quantification of Deltoid Muscles for Shoulder Arthroplasty

ABSTRACT. The deltoid muscles play a crucial role in maintaining balanced arm function and enabling abduction following shoulder arthroplasty. Currently, pre-operative assessments of deltoid integrity rely primarily on visual inspection of medical images and subjective ratings. Recent work has shown the accuracy of a machine learning (ML) based pipeline in segmenting and quantifying characteristics of the deltoid muscle in shoulder CT scans. In this paper, with input from medical experts, we evaluated the clinical acceptance and non-inferiority of the ML-based segmentations compared to the corrections provided by expert surgeons. The non-inferiority of the ML model was assessed by comparing model-generated masks to surgeons’ and inter-surgeon variations in metrics such as volume and fatty infiltration percentage. Expert validation showed 97% of masks to be clinically acceptable, with only 6% of ML-generated masks requiring any major corrections. The median error in the volume and fatty infiltration measurements was <1% between the ML-generated masks and the masks corrected by surgeons. The non-inferiority analysis demonstrated no significant difference between the generated masks and surgeons’ and inter-surgeon variations (p<0.05).

14:30
Anatomical Shoulder Construction Conformity Evaluation, Based on Deep Learning Uncertainty Estimation

ABSTRACT. The use of deep learning (DL) in medical image analysis has significantly improved and accelerated the creation of 3D digital twins for preoperative planning in orthopedic surgeries. Despite these advancements, challenges remain in evaluating the reliability of DL outputs, necessitating manual verification that is time-consuming and prone to error. This work addresses these limitations by proposing a framework capable of assessing the conformity of DL-generated digital twins based on uncertainty estimation. The framework evaluates predictions at four key steps: side prediction, volume of interest (VOI) localization, scapula segmentation, and landmark detection. Dedicated datasets containing conforming (CC) and non-conforming (NCC) constructions were established for each step to assess model performance. Conformity decisions were based on uncertainty scores derived from the models, using methods such as consistency checks, outlier detection, and statistical analyses. The results demonstrate high sensitivity and specificity in most tasks, particularly scapula segmentation, where both evaluation methods achieved 100% sensitivity and specificity up to 97.7%. These findings highlight the capability of uncertainty-based approaches to identify critical non-conformities effectively. This study emphasizes the importance of addressing DL model overconfidence and epistemic uncertainty in clinical applications. Future work will focus on larger datasets, with the aim of optimizing conformity thresholds to improve inspection robustness and accuracy, enhancing the safety and reliability of DL-driven medical workflows.

14:40
Machine learning-based automatic implant size prediction for total knee arthroplasty using bone dimensions

ABSTRACT. The accurate selection and positioning of implants are crucial to the success of joint replacement surgeries, including total knee arthroplasty (TKA). Early and precise prediction of implant sizes prior to surgery offers several benefits, including enhanced operative field preparation, improved inventory management, and optimized resource allocation and storage. Traditional templating methods rely on implant templates of varying sizes, which are used alongside radiographic images of the bone. While automatic templating, which eliminates manual intervention, can expedite the creation of computer- or robot-assisted surgical plans, our previous methods depend on anatomical landmark matching. This process is prone to human error in defining the landmark correspondences in the bone model. In this paper, we propose a novel approach that moves beyond point correspondences and matching. Instead, our method leverages bone dimensions for implant size prediction. The approach automatically computes key dimensions such as the anterior-posterior width of the distal femur and the medial-lateral diameter of the proximal tibia from segmented bone models. Using these computed dimensions, we train two regression models that correlate the bone dimensions with the implant sizes selected by surgeons, resulting in a fully automated and scalable implant size prediction pipeline. Experimental evaluation on a dataset of 292 knee CT images demonstrates that the proposed method improves TKA implant size prediction accuracy.

14:50
Quantifying Segmentation Uncertainty to Evaluate the Quality of ML Generated Deltoid Masks in Shoulder Arthroplasty Patients

ABSTRACT. Despite the growing development of image-based machine-learning models, their integration into clinical practice remains limited. A significant barrier to adoption is the reliability of these models' predictions. This study demonstrates the use of uncertainty analysis to evaluate the output of a CT-based model trained to segment deltoid muscles in shoulder arthroplasty patients. By quantifying uncertainty through metrics such as entropy, mutual information, and variance, we created 46 distinct image-level uncertainty scores for 108 good-quality and 100 low-quality segmentation outputs. In addition, these uncertainty scores were used to train a Gaussian Naïve Bayes model to identify low-quality cases, and the results were compared with those from single-metric thresholding. The results show that boundary 75th-percentile entropy is the most predictive single uncertainty parameter (accuracy: 68%, recall: 68%, precision: 67%), while the trained model outperformed all single predictive metrics (accuracy: 78%, recall: 76%, precision: 78%). Our study indicates a use case for uncertainty analysis in identifying segmentation outputs that may require further manual correction, which will increase trust in ML segmentation models and potentially aid their clinical adoption.
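
A minimal sketch of how a single image-level score such as the boundary 75th-percentile entropy might be computed is given below. It assumes the model outputs a foreground probability per pixel and that the boundary pixels have already been selected; the abstract specifies neither, and the function names and sample probabilities are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a foreground probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def boundary_p75_entropy(boundary_probs):
    """75th percentile of per-pixel entropy over the given boundary pixels."""
    return percentile([binary_entropy(p) for p in boundary_probs], 75)


# Hypothetical foreground probabilities for pixels along a mask boundary.
confident_mask = [0.99, 0.98, 0.01, 0.02, 0.97]
uncertain_mask = [0.55, 0.45, 0.60, 0.40, 0.50]
print(boundary_p75_entropy(confident_mask), boundary_p75_entropy(uncertain_mask))
```

Single-metric thresholding then amounts to flagging a mask for review when its score exceeds a chosen cut-off; the cut-off itself would have to be tuned on labeled good-quality and low-quality cases.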

15:30-17:00 Session 8: Education & Training
15:30
Early Learning Curve in Robotic-Assisted Total Knee Arthroplasty
PRESENTER: David Putzer

ABSTRACT. This study aimed to assess the learning curve for robotic-assisted total knee arthroplasty (RA TKA) performed by three experienced surgeons, examining procedure duration, surgeon satisfaction, and confidence levels. A prospective study was conducted involving three senior arthroplasty surgeons, each performing 15 RA TKA procedures with the Triathlon Knee System and Robotic Arm Interactive Orthopedic (RIO) System. Data on preparation, cut-to-suture, and breakdown times were recorded. Statistical analysis was performed using GraphPad Prism. The average cut-to-suture time was 1 hour 38 minutes, with notable reductions in robotic-specific tasks as experience grew. Comparing the first and last five surgeries, navigation hardware mounting, landmarks registration, femur, and tibia registration, and bone preparation times decreased by up to 30% (p<0.001 to p=0.025). General instrument preparation time was reduced by 20% (p=0.004). This study highlights a significant learning curve for RA TKA, with enhanced efficiency and surgeon confidence by the fifteenth procedure. These results emphasize the potential for optimized workflows and provide valuable insights for training new robotic knee arthroplasty users.

15:40
Enhancing Orthopedic Surgical Training With Interactive Photorealistic 3D Visualization
PRESENTER: Roni Lekar

ABSTRACT. Surgical training integrates several years of didactic learning, simulation, mentorship, and hands-on experience. Challenges include stress, technical demands, and new technologies. Orthopedic education often uses static materials like books, images, and videos, lacking interactivity. This study compares a new interactive photorealistic 3D visualization to 2D videos for learning total hip arthroplasty. In a randomized controlled trial, participants (students and residents) were evaluated on spatial awareness, tool placement, and task times in a simulation. Results show that interactive photorealistic 3D visualization significantly improved scores, with residents and those with prior 3D experience performing better. These results emphasize the potential of the interactive photorealistic 3D visualization to enhance orthopedic training.

15:50
Effect of Simulator Fidelity on Trainee Satisfaction and Skill Acquisition in Arthroscopic Surgery Training for novices: A Randomized Controlled Trial

ABSTRACT. Aim: To evaluate the influence of simulator fidelity on skill acquisition and trainee satisfaction in arthroscopic knee surgery simulation training for surgical novices. Methods: A prospective, randomized, comparative study was conducted using a high-fidelity virtual reality simulator (ArthroSim) and a low-fidelity simulator (AZBOTS). First-year medical students (n=28) underwent 90 minutes of training with either simulator and were subsequently assessed for knee arthroscopy skills using ArthroSim, including tasks such as observing and probing the medial compartment and cruciate ligaments. Skill acquisition was measured by procedure completion time and the Arthroscopic Surgery Skill Evaluation Tool Global Rating Scale (ASSET score). Fifth-year medical students (n=109) evaluated trainee satisfaction via a 7-point Likert scale questionnaire assessing spatial judgment, hand-eye coordination, camera navigation, instrument handling, and knee arthroscopy training. Results: No significant differences in skill acquisition (ASSET scores or individual components) were observed between the two simulator groups. Similarly, no significant differences were found in total questionnaire scores or in spatial judgment, hand-eye coordination, and camera navigation. However, trainees using the high-fidelity simulator reported significantly higher satisfaction in instrument handling (p=0.024) and knee arthroscopy training (p=0.033). Conclusions: While skill acquisition did not differ significantly between high- and low-fidelity simulators after a single training session, the high-fidelity simulator significantly improved trainee satisfaction, particularly regarding instrument handling and the overall training experience. This suggests that simulator fidelity impacts training experience quality, emphasizing the importance of high-fidelity elements in surgical education for novices.

16:00
Evolving simulation-based education in trauma care: a user-perspective on implementation requirements
PRESENTER: Andreas Arnegger

ABSTRACT. This study explores the integration of simulation-based augmented reality (AR) education in trauma care, focusing on digital twins and computer simulations for interactive learning. Traditional case discussions in fracture treatment rely on retrospective analysis. In contrast, this approach allows participants to experiment with treatment strategies and analyze their effects using predictive analytics, enhancing surgical outcomes. The OSORA educational platform was deployed in trauma courses, utilizing the Ulm fracture healing model to simulate the bone tissue differentiation process. Participants could modify fracture management strategies and assess healing metrics such as interfragmentary movement and bone tissue formation. Interactive visualizations facilitated understanding of complex mechanobiological relationships, fostering a transition from passive to active learning. Feedback from 109 participants and faculty members indicated positive reception of the concept. Course participants appreciated the clarity of learning objectives and the engaging nature of digital twins in case discussions. Faculty highlighted the potential of the platform to reduce preparation workload through improved usability and asynchronous formats. However, challenges such as technical requirements for 3D visualizations and the need for faculty onboarding were noted. Future directions include extending the tool’s applications to cover the entire skeletal system, incorporating clinical planning software, and enhancing quality management through rigorous validation. Simulation-based education holds promise for improving trauma training, offering a risk-free environment to explore surgical outcomes and post-operative scenarios, ultimately bridging the gap between education and clinical practice.

16:10
AO’S NOVEL DIGITAL ENHANCED HANDS-ON SURGICAL TRAINING (DEHST) TECHNOLOGY IMPROVES THE SKILLS LEVEL OF NOVICES IN DISTAL INTRAMEDULLARY NAIL INTERLOCKING
PRESENTER: Jan Buschbaum

ABSTRACT. Purpose A new technology, Digitally Enhanced Hands-on Surgical Training (DEHST), was introduced to provide a safe, radiation-free environment for training surgical skills. DEHST is a modular, portable platform that includes a reduced model of a simulated intraoperative image intensifier that simulates radiation-free X-ray images through an artificial X-ray engine. A DEHST training module was specifically designed to teach the demanding procedure of freehand distal interlocking of intramedullary nails. The goal of the study was to evaluate the proficiency of novices achieved through training with DEHST. Methods Ten novices (group DEHST) underwent five DEHST training sessions. Surgical performance was evaluated in a simulated operation using a real image intensifier, a real tibia nail, and an artificial bone model. The results were compared with those from another group of untrained novices (group novices, n=10) and expert surgeons (group expert, n=10) performing the same simulated procedure. Results The DEHST group showed improved results compared to the untrained novices, with a higher success rate (90% vs. 60%) and better performance in terms of time (414.7s vs. 623.4s, p=0.041) and accuracy of nail hole roundness (95.0% vs. 80.8%, p<0.001). Compared to the experts, DEHST participants showed similar results for the number of X-rays (26.0 vs. 22.5, p=0.281) and hole roundness accuracy (95.0% vs. 93.3%, p=0.087). However, the DEHST group took significantly longer to complete the task than the experts (414.7s vs. 256.1s, p=0.001), and one participant (10%) failed to hit the nail hole with the drill, while all experts succeeded. Conclusions DEHST has been shown to be an effective tool for improving surgical skills for freehand distal interlocking. However, additional research is needed to assess its performance in real-life surgical settings.

16:20
Impact of Virtual Reality-Based Preoperative Planning on the Comprehension of Orthopedic Surgery Residents in Trauma Surgeries
PRESENTER: Haruki Fujikawa

ABSTRACT. This study examines the impact of virtual reality (VR)-based preoperative planning on the spatial comprehension of orthopedic surgery residents, focusing on trauma procedures. Twelve anonymized cases, including tibial plateau, tibial shaft, and femoral shaft fractures, were analyzed. Three orthopedic residents conducted preoperative planning using both traditional handwritten methods and VR-based approaches. Spatial comprehension, implant-bone compatibility, and usability were assessed using a modified Likert scale and the System Usability Scale (SUS) questionnaire. Results indicated that VR-based planning significantly enhanced spatial understanding and implant compatibility compared to conventional methods. However, no significant differences were observed in perceived procedural complexity or overall usability, with VR planning showing slightly lower SUS scores and greater interparticipant variability. A major advantage of VR planning was its ability to facilitate interaction with 3D models, such as color-coded fracture fragments and simulated reductions, which improved spatial comprehension. These findings align with prior research emphasizing VR’s benefits in enhancing anatomical understanding and surgical confidence. Limitations of this study include the small sample size, the absence of clinical outcome data, and the developmental stage of the VR software. Future studies should integrate haptic feedback and expand sample sizes to assess clinical efficacy. Despite these constraints, this study underscores the potential of VR-based planning as an effective educational tool for orthopedic residents, offering significant improvements in surgical preparation and spatial comprehension.

16:30
Learning curves of robot-assisted pedicle screw fixations based on the cumulative sum test
PRESENTER: Qi Zhang

ABSTRACT. Background: This study aimed to analyze the learning curve of robot-assisted (RA) pedicle screw fixation (PSF) through fitting the operation time curve based on the cumulative summation method. Methods: RA PSFs that were initially completed by two surgeons at the Beijing Jishuitan Hospital were analyzed retrospectively. Based on the cumulative sum of the operation time, the learning curves of the two surgeons were drawn and fit to polynomial curves. The learning curve was divided into the early and late stages according to the shape of the fitted curve. The operation time and screw accuracy were compared between the stages. Results: The turning point of the learning curves from Surgeons A and B appeared in the 18th and 17th cases, respectively. The operation time [150 (128, 188) minutes vs. 120 (105, 150) minutes, P=0.002] and the screw accuracy (87.50% vs. 96.30%, P=0.026) of RA surgeries performed by Surgeon A were significantly improved after he completed 18 cases. In the case of Surgeon B, the operation time (177.35 ± 28.18 minutes vs. 150.00 ± 34.64 minutes, P=0.024) was significantly reduced, and the screw accuracy (91.18% vs. 96.15%, P=0.463) was slightly improved after the surgeon completed 17 RA surgeries. Conclusions: After completing 17 to 18 cases of RA PSFs, surgeons can pass the learning phase of RA technology. The operation time is reduced afterward, and the screw accuracy shows a trend of improvement.
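
The cumulative summation step can be sketched as follows. This is a hedged illustration, not the authors' analysis: it assumes the common CUSUM form in which each case contributes its operation time minus the series mean, it takes the peak of the raw curve as the turning point, and it omits the polynomial fit described in the abstract. The function names and the operation times are hypothetical.

```python
def cusum_curve(operation_times):
    """Running sum of (operation time - mean operation time) per consecutive case."""
    mean_time = sum(operation_times) / len(operation_times)
    curve = []
    running = 0.0
    for t in operation_times:
        running += t - mean_time
        curve.append(running)
    return curve


def turning_point(operation_times):
    """1-based case number at which the CUSUM curve peaks.

    Before the peak, cases run longer than average (early stage);
    after it, they run shorter than average (late stage).
    """
    curve = cusum_curve(operation_times)
    peak_index = max(range(len(curve)), key=curve.__getitem__)
    return peak_index + 1


# Made-up operation times in minutes for eight consecutive cases.
times = [180, 170, 175, 160, 120, 125, 115, 130]
print(cusum_curve(times))
print(turning_point(times))
```

Because the deviations from the mean sum to zero, the curve always returns to zero at the last case; the informative feature is where it changes from rising to falling.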

16:40
8 Years of Shoulder Navigation, an Intra-Operative Performance Study

ABSTRACT. This study aims to retrospectively evaluate the intra-operative performance of computer-assisted navigation (CAN) total shoulder arthroplasty (TSA) over years, analyzing temporal performance evolution and influencing factors.