I am currently a Master’s student in Electronic and Information Engineering (Computer Technology) at the High-throughput Computer Research Center (HTCRC), part of the Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS), and the University of Chinese Academy of Sciences (UCAS). I am supervised by Prof. Dongrui Fan and currently work in the GIMLab, directed by Assoc. Prof. Mingyu Yan, focusing on high-throughput graph neural network (GNN) inference systems.
I obtained my B.Eng. degree in Computer Science and Technology from the School of Information Science and Technology (SIST) at ShanghaiTech University. There, I worked in Toast Lab with Assist. Prof. Chundong Wang on co-optimization of NoSQL databases and file systems. Before this, I gained some experience in the Attitude Research Lab of the School of Entrepreneurship and Management (SEM) at ShanghaiTech University, working with Assoc. Prof. Lifeng Yang on ethics in information technology.
My research interests broadly encompass computer systems and architecture. In addition to my work on GNN inference systems, I am participating in the “One Student One Chip” (一生一芯) project, learning to implement a RISC-V CPU together with various testing and optimization techniques. I also have a background in operating systems: I completed the Pintos project in the Operating Systems I course at ShanghaiTech and served as a TA for the course in the following academic year.
Education
| Time | School |
| --- | --- |
| 2022-09 - 2025-06 | University of Chinese Academy of Sciences; Institute of Computing Technology, Chinese Academy of Sciences |
| 2018-09 - 2022-06 | ShanghaiTech University |
Publications
- Haoran Dang, Meng Wu, Mingyu Yan, Xiaochun Ye, Dongrui Fan, “GDL-GNN: Applying GPU Dataloading of Large Datasets for Graph Neural Network Inference”, 30th International European Conference on Parallel and Distributed Computing (Euro-Par 2024), 2024. DOI: 10.1007/978-3-031-69766-1_24. (CCF-B, ISTP, EI)
- Haoran Dang, Chongnan Ye, Yanpeng Hu, Chundong Wang, “NobLSM: An LSM-tree with Non-blocking Writes for SSDs”, 59th ACM/IEEE Design Automation Conference (DAC 2022), 2022. DOI: 10.1145/3489517.3530470. (CCF-A, ISTP, EI, Open Access)
[Full Publication List]
Research Projects
- GDL-GNN: GPU Dataloading for GNN Inference [Paper] [Code]
- Motivation: Graph neural networks (GNNs) suffer from performance bottlenecks due to inefficient data transfer between host and device memory during inference.
- Solution:
- Partition the large graph into subgraphs that can fit in GPU memory. Include features of halo nodes (the neighbors outside the partition) to ensure all the data required for the inference of a subgraph is contained within the subgraph itself, avoiding memory access outside the GPU during inference.
- Use hop-masks to avoid unnecessary computation and storage. Compute only on the necessary nodes of each layer in the GNN model to reduce both memory and time consumption.
- Hide load latency and optimize for multiple GPUs. Overlap the inference of the current subgraph with the loading of the next one, and distribute subgraphs dynamically across GPUs to achieve natural load balancing (see the sketch below this entry).
- Result: Significantly reduces inference time while maintaining the accuracy of full-graph inference.
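The latency-hiding step above can be pictured as a two-slot producer/consumer pipeline. Below is a minimal, CPU-only C sketch of that idea, not the paper’s actual GPU implementation: `Subgraph`, `load_subgraph()`, and `infer_subgraph()` are hypothetical stand-ins for the real dataloading and inference routines.

```c
/* Double-buffered pipeline: a loader thread fills one slot while the main
 * thread runs inference on the other, hiding the load latency.
 * load_subgraph() and infer_subgraph() are hypothetical placeholders. */
#include <pthread.h>
#include <stdio.h>

#define NUM_SUBGRAPHS 8

typedef struct { int id; /* node features, halo features, hop-masks, ... */ } Subgraph;

static Subgraph slots[2];                       /* one slot loading, one computing */
static int ready[2] = {0, 0};                   /* slot holds a loaded, unconsumed subgraph? */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;

static void load_subgraph(Subgraph *g, int id) { g->id = id; /* host-to-device copy */ }
static void infer_subgraph(const Subgraph *g)  { printf("infer subgraph %d\n", g->id); }

static void *loader(void *arg) {
    (void)arg;
    for (int i = 0; i < NUM_SUBGRAPHS; i++) {
        int s = i % 2;
        pthread_mutex_lock(&lock);
        while (ready[s])                        /* wait until this slot was consumed */
            pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);

        load_subgraph(&slots[s], i);            /* overlaps with inference of subgraph i-1 */

        pthread_mutex_lock(&lock);
        ready[s] = 1;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, loader, NULL);
    for (int i = 0; i < NUM_SUBGRAPHS; i++) {
        int s = i % 2;
        pthread_mutex_lock(&lock);
        while (!ready[s])                       /* wait for subgraph i to finish loading */
            pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);

        infer_subgraph(&slots[s]);              /* loader meanwhile fills the other slot */

        pthread_mutex_lock(&lock);
        ready[s] = 0;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
    }
    pthread_join(t, NULL);
    return 0;
}
```

While subgraph i is being inferred from one slot, the loader fills the other, so the transfer time of subgraph i+1 is hidden behind the compute time of subgraph i; the same structure extends to multiple GPUs by letting each GPU pull the next unassigned subgraph.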
- NobLSM: LSM-tree with Non-blocking Writes [Paper]
- Motivation: NoSQL databases using log-structured merge (LSM) trees often experience performance bottlenecks during the compaction process, with a significant portion of the latency caused by blocking `sync` operations when generating new SSTable files.
- Solution:
- Delay the deletion of old SSTables until the new SSTable is safely stored, using a `map` to track the relationship between old and new SSTables.
- Utilize the periodic commit of the file-system journal. Since a journal transaction containing the `inode` commits only after the corresponding data blocks have been written to disk, the commit itself ensures that the file is safely stored.
- Based on the above mechanisms, safely remove the blocking `fsync`/`fdatasync` operations when generating SSTables during compactions (see the sketch below this entry).
- Result: Outperforms state-of-the-art competitors while maintaining the same level of consistency.
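The deferred-deletion mechanism can be sketched as follows. This is not NobLSM’s actual code: `PendingCompaction`, `journal_txn_committed()`, and the file names are made-up illustrations, and the commit signal here is a stub standing in for the file system’s periodic journal commit.

```c
/* Sketch of deferring old-SSTable deletion until the file-system journal has
 * committed the new SSTable, so compaction needs no blocking fsync/fdatasync.
 * All names here are hypothetical; this is an illustration, not NobLSM's code. */
#include <stdio.h>
#include <unistd.h>

#define MAX_OLD 8

typedef struct {
    char new_sst[64];             /* freshly written SSTable, not fsync'ed           */
    char old_ssts[MAX_OLD][64];   /* compaction inputs kept until new_sst is durable */
    int  n_old;
    unsigned long txn_id;         /* journal transaction covering new_sst's inode    */
} PendingCompaction;

/* Stub: in this sketch we pretend the periodic journal commit already happened;
 * the real mechanism relies on the ext4 journal's actual periodic commits. */
static int journal_txn_committed(unsigned long txn_id) {
    (void)txn_id;
    return 1;
}

/* Run periodically instead of calling fsync during compaction: old SSTables are
 * unlinked only once the journal commit has made the new SSTable persistent. */
static void reclaim_if_durable(PendingCompaction *pc) {
    if (!journal_txn_committed(pc->txn_id))
        return;                               /* not durable yet: keep the old tables */
    for (int i = 0; i < pc->n_old; i++) {
        unlink(pc->old_ssts[i]);              /* safe: new_sst now survives a crash   */
        printf("removed %s\n", pc->old_ssts[i]);
    }
    pc->n_old = 0;
}

int main(void) {
    PendingCompaction pc = { "L1-000042.sst", { "L0-000007.sst", "L0-000008.sst" }, 2, 123 };
    reclaim_if_durable(&pc);
    return 0;
}
```

Durability is thus piggybacked on the journal’s periodic commit: if a crash happens before the commit, the old SSTables are still on disk and the database can fall back to them.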
Coursework Projects
- The “One Student One Chip” (一生一芯) Project
Student ID: ysyx_24070014 [Learning Record] [Code]
- Software: Implemented NEMU, a RISC-V emulator for teaching purposes.
- Supports execution of RV32IM instructions so far.
- Implements some library functions: the `str_` and `mem_` functions in `string.c`, and `sprintf` with `%d`, `%u`, and `%s`.
- Uses Spike as the reference for differential testing, comparing register state after the execution of every instruction (see the sketch after this project).
- Hardware: Currently developing a single-cycle RISC-V CPU using Verilog.
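The differential testing mentioned above boils down to executing one instruction on both the emulator and the reference and then comparing the full register state. The C sketch below illustrates that loop; `RegState`, `dut_exec_once()`, and `ref_exec_once()` are hypothetical stubs, not the real NEMU/Spike interfaces.

```c
/* Sketch of per-instruction differential testing: run the DUT (e.g. NEMU) and a
 * reference (e.g. Spike) in lockstep and stop at the first register mismatch.
 * dut_exec_once()/ref_exec_once() are hypothetical stubs for illustration. */
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_GPR 32

typedef struct {
    uint32_t gpr[NR_GPR];   /* x0..x31 */
    uint32_t pc;
} RegState;

/* Stubs: each call executes exactly one instruction and updates the state. */
static void dut_exec_once(RegState *r) { r->pc += 4; }
static void ref_exec_once(RegState *r) { r->pc += 4; }

/* Compare the architectural state of DUT and reference after one instruction. */
static bool check_regs(const RegState *dut, const RegState *ref) {
    if (dut->pc != ref->pc) {
        printf("pc mismatch: dut=0x%08" PRIx32 " ref=0x%08" PRIx32 "\n", dut->pc, ref->pc);
        return false;
    }
    for (int i = 0; i < NR_GPR; i++) {
        if (dut->gpr[i] != ref->gpr[i]) {
            printf("x%d mismatch: dut=0x%08" PRIx32 " ref=0x%08" PRIx32 "\n",
                   i, dut->gpr[i], ref->gpr[i]);
            return false;
        }
    }
    return true;
}

int main(void) {
    RegState dut = {0}, ref = {0};
    for (long n = 0; n < 1000; n++) {           /* one instruction per iteration */
        dut_exec_once(&dut);
        ref_exec_once(&ref);
        if (!check_regs(&dut, &ref)) {
            printf("divergence after %ld instructions\n", n + 1);
            exit(1);
        }
    }
    printf("register states matched for 1000 instructions\n");
    return 0;
}
```

Checking after every instruction is slow, but it localizes a bug to the exact instruction at which the emulator first diverges from the reference.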
- Pintos (course project of Operating Systems I at ShanghaiTech)
- Implemented a multi-threaded kernel with scheduling and thread management, user programs and system calls, virtual memory, and a file system with a buffer cache.
Teaching
| Time | Course | Role | Location |
| --- | --- | --- | --- |
| Fall, AY 2021-2022 | Operating Systems I | TA | ShanghaiTech |
Academic Awards
| Time | Award | Awarded By |
| --- | --- | --- |
| 2024-05 | Outstanding Student (AY 2023-2024) | UCAS |
| 2022-06 | Outstanding Thesis (undergraduate FYP) | SIST, ShanghaiTech |
Miscellaneous
Pronunciation of My Name
|  | Family Name | Given Name |
| --- | --- | --- |
| Chinese | 党 | 浩然 |
| Pinyin | Dǎng | Hàorán |
| IPA | [tɑŋ] | [xɑʊʐan] |
Based on the course of the same name offered at ShanghaiTech, this book comprehensively discusses the potential ethical and social impacts of recent technological advances in information science.
Topics include ethical principles, AI decision-making, big data, privacy, and digital identity. [Details]
You can buy it at JD.com, or from other bookstores and shopping platforms.