Sydney Precision Data Science Centre

Data Science Insights

Edition 6, December 2025

Message from the Director - Professor Jean Yang

As we wrap up a busy year for the Sydney Precision Data Science Centre, I’m thrilled to share some of the highlights in this newsletter. From groundbreaking research like GHIST, to celebrating our talented researchers and ECR initiatives, 2025 has been a year of growth, collaboration, and impact.

One of the biggest milestones was hosting the Australian Data Science Network (ADSN) Conference at the University of Sydney — a vibrant gathering that showcased the strength of our data science community and sparked important conversations about the future of AI.

A special congratulations to nine of our researchers for their success in securing Discovery Project funding! Thank you to everyone who contributed to these successes, and to our sponsors and partners for their continued support. Here’s to building on this momentum as we head into 2026.

 

Researcher Spotlight:
Dr Andy Tran

We are delighted to introduce Dr Andy Tran in this edition's researcher spotlight. Andy completed a PhD in statistical bioinformatics, developing methods using multiomics data for precision medicine. Andy is now an education-focused lecturer in the School of Mathematics and Statistics at the University of Sydney. He specialises in teaching large introductory data science courses, as well as project-based courses.

What excites you about data science?
Data science is such a powerful and versatile tool to guide decision making. In a world with so much information (and misinformation!), data science gives us the means to draw evidence-based insights to inform science and policy.

What is keeping you busy at the moment?
I'm being kept very busy with teaching a large course with around a thousand students. The amount of admin and bureaucracy that comes with teaching is often unnoticed and underappreciated!

How long have you been working with the SPDS?
I have had the pleasure of witnessing the inception of SPDS when I was a PhD student with the group, and being able to watch its growth over the last few years. It's an honour to join the centre as a Research Leader focussing on education!

What’s next for you at the SPDS?
As my primary focus is in education, I would love to support SPDS to have a greater role in the training and development of future data scientists and promote data literacy and appreciation at a wider scale.

Where could we find you when you’re not working?
I'll probably be swimming in a pool or going for a walk!

👉 Visit Andy's online profile to find out more. 

 

Feature Method: GHIST

In this newsletter, we highlight GHIST, a new deep learning method that predicts single-cell gene expression from routinely collected histology images. Recently published in Nature Methods and showcasing the Centre’s collaborative culture, this work is a joint effort led by Jean Yang, Jinman Kim, and Ellis Patrick across the Precision Bioinformatics and Quantitative Technology clusters. The project was motivated by earlier work from PhD students Chuhan Wang and Adam Chan (published in Nature Communications, Jan 2025), who showed that high-resolution data markedly improves molecular prediction from histology. Building on this insight, postdoctoral researchers Helen Fu and Yue Cao combined their complementary strengths in computer vision and statistical bioinformatics to design and implement GHIST.

Why is GHIST important? Histology images are a foundational diagnostic tool in pathology and are routinely collected whenever tissue is examined in clinical care. By training on a small number of samples with matched histology and spatial transcriptomics data, GHIST learns how the visual features pathologists see correspond to molecular activity within cells. Once trained, GHIST can be applied across large patient cohorts, transforming routinely collected histology slides into spatial molecular maps that support new biological hypotheses and downstream analysis.

GHIST introduces several key capabilities: it is the first method to infer cell-level spatial transcriptomes directly from routine H&E images; it uses multitask deep learning to fuse information on morphology, cell identity and neighbourhood context; it operates flexibly across both cell-level and spot-level platforms such as Xenium and Visium to generate in-silico spatial omics atlases; and it enables scalable patient-level molecular profiling to support biomarker discovery, risk stratification, and treatment planning.
👉 Read the full paper in Nature Methods.

Figure 1: GHIST Framework

 

CPC Data Science Hub

This year, we launched an exciting new initiative — the Charles Perkins Centre (CPC) Data Science Hub, led by Professor Jean Yang (Director) and Dr Alistair Senior (Deputy Director) from the Sydney Precision Data Science Centre. The Hub is designed to advance data-intensive research with real-world impact in biomedical, metabolomics, and epidemiological domains, while fostering collaboration and building capacity across the university.

Since becoming fully operational in April 2025, the Hub has already supported 37 projects spanning multiple faculties and schools. These projects include grant applications and research expected to lead to peer-reviewed publications, reinforcing CPC’s mission and visibility.

We were delighted to welcome Jie Kang as our first team member and are excited that three new data analysts will join early in the new year, further strengthening our capability. With operational systems and resources now in place, the Hub is ready to provide expert consultation, training, and analytics support to CPC researchers.

This initiative marks a major step toward enhancing translational impact and creating a collaborative ecosystem for data-driven discovery.
👉 Visit the Data Science Hub website to learn more. 

 

News and updates

August
•  Our third annual Winter Data Analysis Challenge saw undergraduate students tackle real-world transport data to uncover patterns in Sydney’s travel behaviours. Congratulations to all winners and participants for their innovative insights!
👉 Read the full story online.

October
•  We’re pleased to congratulate Shila Ghazanfar and Pengyi Yang on the publication of their article titled “Multi-task benchmarking of single-cell multimodal omics integration methods” in Nature Methods. Their work introduces a decision-tree tool and an interactive resource to help researchers navigate method selection for complex single-cell datasets.
👉 Read more and find links to the paper on our website.

•  It was exciting to see many of our centre's research leaders receive ARC Discovery Project grants. Congratulations to Jean Yang, Ellis Patrick, Jinman Kim, Samuel Muller, Garth Tarr, Rachel Wang, Shila Ghazanfar, Tongliang Liu, and Lei Bi, whose research has been recognised in the latest round of ARC Discovery Projects. These awards underscore the ARC’s commitment to advancing data science as a foundational tool for solving complex challenges in our society.
👉 Find the full list of their projects and collaborators on our website.

November
•  We were proud to host the Australian Data Science Network (ADSN) Conference 2025 at the University of Sydney from 10–12 November.
👉 Read the full wrap-up in our online updates.

 

Seminar series wrap-up

This year’s seminar series brought together leading researchers at various career stages to share cutting-edge advances in computational and statistical approaches to biology. The sessions featured diverse topics—from single-cell omics and CRISPR genome editing to AI-driven drug discovery—fostering collaboration and knowledge exchange across disciplines. The series continues to be a vital forum for connecting experts and supporting early- and mid-career researchers.
👉 Read the full wrap-up online.

 

Australian Data Science Network Conference 2025

The Sydney Precision Data Science Centre was proud to host the Australian Data Science Network (ADSN) Conference 2025 at the University of Sydney from 10–12 November. Over three days, more than 100 attendees came together for keynotes, panels, and lively debates on the future of AI and data science.

Highlights included Prof Bin Yu’s keynote on veridical data science, fast-forward presentations from HDR and ECR researchers, and a fiery LLM Debate asking: Will AI replace scientists? Day 3 focused on industry engagement and applied data science, with sessions on AI adoption in high-risk sectors and federated learning in healthcare.

Thank you to our sponsors: Australian Research Data Commons, Faculty of Science, and the School of Mathematics and Statistics at the University of Sydney.
👉 Catch the full wrap-up on our website. 

 

ECR Update

Chair of the SPDS ECR Committee, Farhan Ameen, reports on the ECR highlights of 2025.

This year has been a big one for our ECRs, with a strong focus on skill-building, centre outreach, and team-building. The peer-led training sessions formed the backbone of this effort. Dr Pratibha Panwar led the Stats Book Club, providing a welcoming space for ECRs to revisit and deepen their understanding of foundational statistical methodologies. Jamie Gabor started the Probability Questions Club, equipping ECRs with the tools and practice to smash through the types of questions found in technical interviews. On the computational side, Jackson Zhou and Martin Huang ran the Coding Club, where ECRs shared practical tips, new methods, and hard-won lessons from their own research.

Outreach was another major highlight. Dr Lijia Yu helped co-organise the annual Australian Data Science Network conference, proudly hosted by our centre this year. A huge thank you to all the ECRs who volunteered to help keep everything running smoothly. Harry Robertson helped kick-start the Digital Research Skills Network at the Westmead Institute for Medical Research, equipping researchers with tools for modern, programmatic biomedical analyses. Rojashree Jayakumar, in her role as the Training and Events Officer at COMBINE, helped organise a bioinformatics hackathon with UNSW BINFSOC, giving undergraduate students an opportunity to get their hands dirty with real-world datasets and problems in bioinformatics.

The ECR committee also hosted Steptember, a month-long, centre-wide step challenge that shifted focus to our own wellbeing. Whether you were a casual stroller or a marathon runner, it was incredible to see everyone’s competitive spirit come alive. During the month, the committee organised a hike along Sydney’s beautiful North Head National Park, where we also spotted whales, swam at the beach, and shared lunch together. Of course, as data scientists, we had to finish the month with some data analysis on our step counts and a few fun superlatives. A big thank you to Daniel Kim, Dr Pratibha Panwar and Dr Beilei Bian for making Steptember a month to remember.

Finally, we had the pleasure of celebrating some key milestones, with PhD graduations for Dr Andy Tran, Dr Carissa Chen and Dr Lijia Yu, and thesis submissions from Jackson Zhou, Harry Robertson, Elijah Willie, and soon Daniel Kim and Chuhan Wang. Thank you for making our community such an inclusive, welcoming and, importantly, fun place to be. You are all role models for so many of us, and we wish you all the very best in your future endeavours.

 

Celebrating Andy Tran's graduation.

Congratulations Lijia Yu!

Steptember at North Head.

ECR dinner to celebrate Daniel Kim's birthday.

 

Copyright © 2025

The University of Sydney 
