Truyen Tran (PhD)


PraDa, School of Information Technology
Deakin University
Locked Bag 20000
Geelong VIC 3220

truyen.tran {_at_} deakin.edu.au
IMPCA, Department of Computing
Curtin University of Technology
Kent St, Bentley, WA 6102, Australia

Research interests

  • Medical Data Analysis
  • Statistical Machine Learning
  • Probabilistic Graphical Models
  • Others: Recommender Systems, Multimedia, Natural Language Processing, Computer Vision, Bioinformatics, Social Media




Research projects


  • CRF-SL: a generic implementation of CRFs for sequential labelling. It supports any data types but requires the raw features extracted from data as the input.
  • viAccent: A Perl-based accent restoration service for Vietnamese. Uses CRFs.
  • viCat: A simple Vietnamese text classifier.




Research Projects

Health Analytics

Modern hospitals and medical centres have collected huge amounts of clinical data on hundreds of millions of patients over the past decades. However, how to make the best use of these data to improve clinical services remains the major question. This research aims at characterising the data using probabilistic models and applying state-of-the-art machine learning techniques for representation, clustering and prediction at both the patient and the cohort levels.


  • One-sided filter bank for feature engineering in Electronic Medical Records. Medical records are noisy, episodic and irregular, and thus create great challenges for analytical tools. We invented a signal-processing-style method for extracting comprehensive features for any analysis task. [Published in: KDD'13]

  • Patient profiling. Building a unified representation of a patient is extremely useful for individual-level analysis. The challenges are in encoding diverse sources of temporal events. [Published in: PAKDD'13].

  • Suicide risk prediction. This problem is extremely difficult due to rarity of suicide events and weak predictive risk factors. [Published in: KDD'13, KAIS, BMC Psychiatry].

  • Cancer mortality prediction. There are many cancers, and the question is whether we should build a single model for all cancers or separate models for individual cancers. [Published in: BMJ Open]

  • Stabilizing clinical prediction models. High-dimensional prediction models are unstable under feature selection and lasso-like shrinkages, and thus less useful because they cannot be reliably estimated from similar sources. [Published in: KAIS, IEEE-JBHI, IWPRHA14].
  • Heart attack readmission prediction. [Published in: Aus Health Review] We show that readily collected electronic medical records compete well with carefully collected clinical data.
  • Preterm birth prediction. [Under submission]
  • Heart failure readmission prediction. [Under minor revision] Again, electronic medical records are informative and structurally rich, and can be exploited to improve model stability and discrimination.
  • Modelling interventions. [Under submission] We show that interventions can be used to stratify patients into groups from which prediction models can be developed more accurately.
  • Web search activities as health indicators. [Published in: WISE14] We show that Web search activities are accurate indicators of many social, economic and health issues.
  • Healthcare visual analytics. [Published in: HISABD14] We demonstrate how visual analytics of the entire patient history can help doctors make better decisions.
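
To make the one-sided filter bank idea above more concrete, here is a minimal sketch in Python. It is not the published method: it assumes the simplest possible kernels (uniform counting windows) and a hypothetical event format of (time_in_days, code) pairs. "One-sided" is taken here to mean that only events before the evaluation point are used; the function name and window edges are illustrative assumptions.

```python
from collections import Counter

def one_sided_features(events, eval_time, window_edges):
    """Count events per code in back-looking windows (assumed uniform kernels).

    events       : iterable of (time, code) pairs, time in days (hypothetical format)
    eval_time    : the point at which features are extracted
    window_edges : ascending ages in days, e.g. [0, 90, 360] gives the
                   windows [0, 90) and [90, 360) measured back from eval_time
    """
    feats = Counter()
    for t, code in events:
        age = eval_time - t
        if age < 0:
            continue  # one-sided: events after the evaluation point are ignored
        for lo, hi in zip(window_edges[:-1], window_edges[1:]):
            if lo <= age < hi:
                feats[(code, lo, hi)] += 1
                break
    return dict(feats)

# Irregular, episodic record: three past events and one future event.
record = [(10, "A"), (300, "A"), (350, "B"), (400, "A")]
features = one_sided_features(record, eval_time=360, window_edges=[0, 90, 360])
```

Each (code, window) pair becomes one feature, so irregularly timed events map to a fixed-length description that standard classifiers can consume.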

Representation learning: Modelling complex and mixed-data types

Raw data may lie on hidden manifolds and contain noise, and thus may not be appropriate for the task at hand. The goal of representation learning is to discover latent factors in the data which are invariant to small changes and insensitive to noise. These factors can then be fed into standard machine learning techniques. The hope is that learning will be much easier (e.g., better linearity and pre-conditioning) and the final performance will be improved (e.g., due to noise reduction and invariance promotion).

Real-world data are heterogeneous: they come from multiple domains and sources, and are represented in different ways. Put another way, they are full of data types: real-valued, binary, ordinal, multiple label, label ranking, preference graph and matrix-variate. Fusing these types to make informed decisions is inherently important.

In machine learning, data types are often treated separately. In statistics, traditional mixed type handling is often inadequate and does not scale well. Here we explore scalable methods to learn from and infer with multiple data types.


  • RBMs for interpretable discovery of latent factors. [Under submission]

  • Modelling nonnegative data with RBMs. A fast nonlinear alternative to the classic Nonnegative Matrix Factorization. Projection on unseen data is only a single matrix operation. [Published in: ACML'13]

  • Modelling ordinal matrices: these are popular in multiuser ratings of common items, such as those in collaborative filtering. [Published in: UAI'09, AAAI'12, ACML'12]

  • Probabilistic models of ordered partitions: we address ranking of subsets instead of individuals, where the subsets are themselves unknown in advance. This leads to an explosion in state-space. Here we introduce a high-order Markov chain over partitions and MCMC methods for learning from ordered partitionings. [Published in SDM'11, ACML'12]

  • Modelling mixed-data types using RBM: integrating multiple data types can be a non-trivial task for many real-world problems. Here we offer a unified way of converting mixed-data types into numerical vector spaces. [Published in ACML'11]

  • Thurstonian Boltzmann Machines: A multivariate utility model for mixed-data types. For better interpretation, here we model the process of data generation for multiple types using random utilities, and at the same time enable information fusion and discovery of latent data aspects. [Published in FUSION'12, ICML'13].

  • Learning probabilistic metrics on learned representation. [Published in ICME'12, ICME'13]

  • Embedding medical objects. [Under submission]
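
The remark above that projecting unseen data through an RBM is only a single matrix operation can be illustrated with a short sketch. It assumes a standard RBM with binary hidden units, where the hidden posterior is a sigmoid of a linear map; the weight matrix `W` and hidden bias `b` stand in for trained parameters and are random placeholders here, not values from any of the cited papers.

```python
import numpy as np

def project(v, W, b):
    """Hidden-unit posteriors P(h = 1 | v) for a binary-hidden RBM.

    v : (n_visible,) or (n_samples, n_visible) array of visible data
    W : (n_visible, n_hidden) weight matrix (assumed already trained)
    b : (n_hidden,) hidden bias vector
    """
    return 1.0 / (1.0 + np.exp(-(v @ W + b)))

rng = np.random.default_rng(0)
W = np.abs(rng.normal(size=(6, 3)))   # placeholder nonnegative weights
b = np.zeros(3)
unseen = rng.random(size=(4, 6))      # four unseen nonnegative data points
latent = project(unseen, W, b)        # one matrix product plus a sigmoid
```

No iterative optimisation is needed at test time, which is the contrast with methods that must solve a fitting problem for every new data point.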

Statistical Relational Learning

Much of real-world data is relational. While this offers stronger statistical properties through compactness, it also proves to be very challenging due to the complexity and scale. Problems of interest include: feature aggregation, concept/predicate invention, clustering and latent structure discovery, collective classification/regression, exploiting symmetries, large-scale learning and inference.


  • Entities modelling from dyadic relations: [Published in AusDM'07].

  • Discovering latent profiles from dyadic relations. [Published in UAI'09, ACML'12].

Recommender systems

The goal is to deliver the right services to the right users. Most existing work is rather ad hoc and ignores the complex nature of the data. Research topics include discovering hidden patterns, incorporating contexts and side-information, social networks, multiple domains, product hierarchies, as well as correlations between actors and items.


  • Preference networks: integrating user profile, item content and ratings into a single probabilistic database using Relational Markov Network. Supports rating prediction, item-ranking (given a user) and user-ranking (given an item). [Published in AusDM'07].

  • Ordinal Boltzmann machines: treating rating matrix as a whole unit and discovering hidden aspects of the data. Also includes treatment of ordinal nature of the data, and Preference networks as a subcomponent. [Published in UAI'09, ACML'12].

  • Sequential ordinal matrix factorisation: investigating the generative mechanism of the rating matrix. We adopt the notion that an ordinal output is the result of a sequential process: given that an item has a utility with respect to a user, we start from the lowest level and stop at the optimal level where the utility falls below its threshold. The utility is a combination of different aspects: the general value of the item, the general easiness of the user, the compatibility between the user and the item, and the relations between the item and other items. [Published in AAAI'12].

  • Collaborative ranking: addressing the mismatch between the data representation (ordinal ratings) and the recommendation goal (ranked list of items or list of subsets), we embrace the notion of learning to rank collaboratively, borrowing ideas from learning to rank in information retrieval and mixed-effect models in statistics. [Published in SDM'11, ACML'12].

  • Learning preferences over sets: moving beyond the current paradigm of preferences over items, here we explore the space of sets directly. [Published in SDM'11, ACML'12].

  • Learning to recommend medical objects.
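
The sequential process described under sequential ordinal matrix factorisation can be sketched as follows. This is a deterministic simplification for illustration only: the utility here has no random component, the item-item relation term is omitted, and the function names, thresholds and factor vectors are all hypothetical.

```python
def utility(item_value, user_easiness, user_vec, item_vec):
    """Additive utility: item value + user easiness + user-item compatibility.

    The compatibility is assumed to be an inner product of latent factors;
    the item-item relation term mentioned in the text is left out.
    """
    compatibility = sum(u * i for u, i in zip(user_vec, item_vec))
    return item_value + user_easiness + compatibility

def sequential_rating(u, thresholds):
    """Start at the lowest level and move up while the utility clears each
    threshold; stop at the first level whose threshold it falls below."""
    level = 1
    for tau in thresholds:
        if u < tau:
            break
        level += 1
    return level

# Four ascending thresholds define a five-level rating scale.
thresholds = [-1.0, 0.0, 1.0, 2.0]
u = utility(item_value=0.2, user_easiness=0.1,
            user_vec=[1.0, 0.5], item_vec=[0.2, 0.2])
rating = sequential_rating(u, thresholds)
```

With these illustrative numbers the utility is about 0.6, which clears the first two thresholds but not the third, so the process stops at level 3.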

Discriminative sequential models

Sequential data are a very rich type of data that we encounter every day. Issues include feature discovery, segmentation, permutation, collocation and n-grams. At one extreme is the problem of statistical machine translation, where the output space (the target language) is theoretically infinite.

  • Learning local and global models with partial labelling schemes [PRICAI'08].

  • Feature selection with exponential loss under partial labelling scheme [ISSNIP'05].

  • Learning a highly constrained sequential model.  [PRICAI'08]

  • Nested sequential models. [Publication: NIPS'08, DLSRRA'09]


Fast learning and inference in intractable graphical models

Learning in generic graphical models involves both the network structure and the parameters. However, structure learning is extremely difficult because the structure space is exponentially large. Even when the structure is known, parameter learning is still very challenging since the quantities needed for learning are often intractable to estimate. For parameter learning, fast methods like pseudo-likelihood are sub-optimal with finite data and often over-estimate the interaction between variables. Message-passing methods can be used for the inner inference loops, but their poor convergence guarantees may stop learning progress too early.

The goal of this project is to investigate options for fast and effective learning and inference.


  • AdaBoost.MRF - learning to select and weight trees from a tree ensemble: not only does this support parameter learning, it also discovers strong interactions among variables, and thus enables structure learning. [CVPR'06]
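
As a concrete reference point for the pseudo-likelihood mentioned above, the sketch below computes the log pseudo-likelihood of one configuration under a pairwise binary Markov random field (an Ising-style model chosen here purely for illustration). It assumes states in {-1, +1}, a symmetric coupling matrix `J` with zero diagonal, and a bias vector `h`; none of these names come from the cited work.

```python
import numpy as np

def log_pseudo_likelihood(x, J, h):
    """Sum over i of log P(x_i | all other variables).

    Assumed model: P(x) proportional to
        exp( sum_i h_i x_i + sum_{i<j} J_ij x_i x_j ),  x_i in {-1, +1}.
    Each conditional is then sigmoid(2 * x_i * field_i), with
        field_i = h_i + sum_j J_ij x_j,
    so no partition function over the joint state space is needed.
    """
    field = h + J @ x
    return -np.sum(np.log1p(np.exp(-2.0 * x * field)))

x = np.array([1.0, -1.0, 1.0])
J = np.array([[0.0, 0.5, 0.0],
              [0.5, 0.0, -0.3],
              [0.0, -0.3, 0.0]])
h = np.zeros(3)
value = log_pseudo_likelihood(x, J, h)
```

The cost is one matrix-vector product per configuration, which is why the method is fast; the price is that it replaces the joint likelihood with a product of local conditionals.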

Activity recognition

Activity recognition is important in assistive tasks and surveillance. Several problems need to be addressed. First, sensory information is often unreliable, so there must be a way to fuse many sensor readings to make informed decisions. Second, activities can be quite complex, consisting of many sub-activities. Third, training labels are often sparse since it would be inconvenient for users to provide labels frequently.


  • Selection of sensory inputs for structured activities under missing labels. We offer a fast beam search to select a very small number of sensor inputs even when the labels are partially missing. [Published in ISSNIP'05].

  • Conditional models for daily routine labelling under sparse labels: we investigate conditional models (Maximum Entropy Markov Models and Conditional Random Fields) when labels are sparse. [Published in PRICAI'08].

  • Joint modelling of multilevel activity abstraction: often activities are performed to achieve some more abstract goals. Here we propose a general framework to model multiple levels of activity abstraction. [Published in CVPR'06].

  • Hierarchical nested activities: when activities are nested in time (a higher-level activity is declared complete only once its child activities have terminated), learning and inference can be performed in polynomial time. [Published in NIPS'08].

Lexical disambiguation


  • Accent restoration in Vietnamese: Given an accentless sentence, we need to restore the lost accents (or tonal marks). [Published in: PRICAI'08] [Demo].









  • Boosted Markov networks for activity recognition, Truyen Tran, Hung Hai Bui and Svetha Venkatesh, In Proc. of International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP2005), 5-8 Dec, Melbourne, Australia.
  • Global optimization using Levy flight, Truyen Tran, Trung Thanh Nguyen, Hoang Linh Nguyen, Second National Symposium on Research, Development and Application of Information and Communication Technology (ICT.rda), Vietnam, 2004.

Working papers

Tutorials and Notes

PhD thesis



  • Extracting medical features for risk prediction, Truyen Tran, Quoc-Dinh Phung, Wei Luo, and Svetha Venkatesh, Provisional patent, June 2013.