Hi! I'm Chang,

a PhD student at Syracuse University, advised by Kristopher Micinski.

My research sits at the intersection of Programming Languages, Machine Learning, and Security, with a focus on reverse engineering: taking binaries and recovering something a human can read and reason about.

My current work is a prototype Datalog-based C decompiler that treats decompilation the way modern compilers treat compilation: as a chain of small, logic-defined passes over a shared fact store, keeping ambiguous interpretations as evidence rather than committing early to one. It’s implemented in 35K lines of Rust and Datalog and lifts Linux ELF binaries to C99.

Previously, I built data infrastructure for binary analysis. Assemblage is a distributed build system and a family of labeled binary datasets produced by compiling open-source projects at scale. It appeared at NeurIPS 2024, and the datasets are widely used across the field.

I’m actively looking for a summer 2026 internship. Please let me know if your team might be a fit! If you’re at Google, I’m currently in the team match phase for a PhD SWE Intern role for Summer 2026.

Publications

  1. Superset Decompilation
    Chang Liu, Yihao Sun, Thomas Gilray, Kristopher Micinski
    arXiv:2603.28002, 2026
  2. Assemblage: Automatic Binary Dataset Construction for Machine Learning
    Chang Liu*, Rebecca Saul*, Yihao Sun, Edward Raff, et al.
    NeurIPS 2024, Datasets & Benchmarks Track
  3. Is Function Similarity Over-Engineered? Building a Benchmark
    Rebecca Saul, Chang Liu, Noah Fleischmann, Richard J Zak, et al.
    NeurIPS 2024, Datasets & Benchmarks Track
  4. ASSEMBLAGE-DEEPHISTORY: A Cross-Build Binary Dataset with Temporal Coverage
    Chang Liu, Noah Fleischmann, Nicolò Altamura, Edward Raff, et al.
    2026

Service

Reviewer: NeurIPS, AAAI AICS Workshop