Conference Schedule

2024-06-15 (Day 1): [Full replay]
Time Sessions Talks
8:45 - 8:55 Opening Introduction [Talk video]
8:55 - 9:00 Welcome Address by Leadership
9:00 - 9:40 Keynote Session #1
Session Chair:
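何水兵 (Zhejiang University)
Methods for Improving the Performance of ZNS SSDs
冯丹 (Huazhong University of Science and Technology) [Talk video]
9:40 - 10:20 The 智海 (Zhihai) Series: Vertical-Domain Large Models and AI Agents
吴飞 (Zhejiang University) [Talk video] [Slides download]
10:20 - 10:30 Tea Break
10:30 - 11:10 Keynote 方天视窗: A Highly Parallel Unified Rendering Architecture
杨程云 (Huawei) [Talk video]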
11:10 - 12:00 Lightning Talk #1
Session Chair:
陈爽 (Huawei)
Lightning Talk [Talk video] [Slides download]
12:00 - 13:30 Lunch + Poster Session
13:30 - 13:50 Oral Session #1
Operating System
Session Chair:
古金宇 (Shanghai Jiao Tong University)
An Empirical Study of Rust-for-Linux: The Success, Dissatisfaction, and Compromise
李弘宇 (Beijing University of Posts and Telecommunications) [Talk video] [Slides download]
13:50 - 14:10 PathFuzz: Broadening Fuzzing Horizons with Footprint Memory for CPUs
徐易难 (Institute of Computing Technology, Chinese Academy of Sciences) [Talk video] [Slides download]
14:10 - 14:30 On-the-fly Quarantine Before Patches for N-day Kernel Vulnerabilities Are Available
戴钦润 (University of Colorado Boulder) [Talk video] [Slides download]
14:30 - 14:50 Flexible, Secure and Efficient CVM Maintenance with Confidential Procedure Calls
陈家浩 (Shanghai Jiao Tong University) [Talk video] [Slides download]
14:50 - 15:10 Taming Hot Bloat Under Virtualization with HugeScope
李传东 (Peking University) [Talk video] [Slides download]
15:10 - 15:20 Tea Break
15:20 - 15:40 Oral Session #2
MLSys + GPU
Session Chair:
郑鹏飞 (Huawei)
CMC: Video Transformer Accelerator with CODEC Assisted Matrix Condensing
宋卓然 (Shanghai Jiao Tong University) [Talk video] [Slides download]
15:40 - 16:00 MagPy: Effective Operator Graph Instantiation for Deep Learning by Execution State Monitoring
张晨 (Tsinghua University) [Talk video] [Slides download]
16:00 - 16:20 Soter: Analytical Tensor-Architecture Modeling and Automatic Tensor Program Tuning for Spatial Accelerators
王福宇 (Sun Yat-sen University) [Talk video] [Slides download]
16:20 - 16:40 Removing Obstacles before Breaking Through the Memory Wall: A Close Look at HBM Errors in the Field
吴榕龙 (Xiamen University) [Talk video] [Slides download]
16:40 - 16:50 Tea Break
16:50 - 17:10 Industry Session
Session Chair:
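徐尔茨 (PDL)
Practices and Challenges of Large-Scale Industrial Applications of Graph Computing
洪春涛 (Ant Group) [Talk video] [Slides download]
17:10 - 17:30 Building Ten-Thousand-Card Clusters for the AI 2.0 Era: AI Infra Construction Practices at 01.AI
谢文 (01.AI) [Talk video]
17:30 - 17:50 AI Systems in the Era of Large Models: Challenges and Outlook
王喆锋 (Huawei Cloud) [Talk video]
17:50 - 18:10 Q & A [Talk video]
18:10 Banquet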

2024-06-16 (Day 2): [Full replay]
Time Sessions Talks
9:00 - 9:40 Keynote Session #2
Session Chair:
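毛波 (Xiamen University)
Software and Hardware Research on "General-Purpose" Brain-Inspired Computing Systems
张悠慧 (Tsinghua University) [Talk video]
9:40 - 10:20 Matrix Computation Optimization for Multi-Core Processors
董德尊 (National University of Defense Technology) [Talk video] [Slides download]
10:20 - 10:30 Tea Break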
10:30 - 10:50 Best Paper Session
Session Chair:
魏星达 (Shanghai Jiao Tong University)
Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning
陈畅 (Peking University) [Talk video] [Slides download]
10:50 - 11:10 What’s the Story in EBS Glory: Evolutions and Lessons in Building Cloud Block Store
张伟东 (Alibaba) [Talk video] [Slides download]
11:10 - 11:30 Towards a Shared-storage-based Serverless Database Achieving Seamless Scale-up and Read Scale-out
陈浩 (Alibaba) [Talk video] [Slides download]
11:30 - 12:00 Lightning Talk #2
Session Chair:
汪睿 (Zhejiang University)
Lightning Talk [Talk video] [Slides download]
12:00 - 13:30 Lunch + Poster Session
13:30 - 13:50 Oral Session #3
Architecture
Session Chair:
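卢丽强 (Zhejiang University)
StreamPIM: Streaming Matrix Computation in Racetrack Memory
安昱达 (Peking University) [Talk video] [Slides download]
13:50 - 14:10 UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space
刘方鑫 (Shanghai Jiao Tong University) [Talk video] [Slides download]
14:10 - 14:30 AVM-BTB: Adaptive and Virtualized Multi-level Branch Target Buffer
刘蕴哲 (Institute of Computing Technology, Chinese Academy of Sciences) [Talk video] [Slides download]
14:30 - 14:50 An Instruction Inflation Analyzing Framework for Dynamic Binary Translators
谢本壹 (Institute of Computing Technology, Chinese Academy of Sciences) [Talk video] [Slides download]
14:50 - 15:00 Tea Break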
15:00 - 15:20 Oral Session #4
Cloud Computing
Session Chair:
付森波 (Huawei)
Harmonizing Efficiency and Practicability: Optimizing Resource Utilization in Serverless Computing with Jiagu
柳清源 (Shanghai Jiao Tong University) [Talk video]
15:20 - 15:40 UFO: The Ultimate QoS-Aware CPU Core Management for Virtualized and Oversubscribed Public
彭雅娟 (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences) [Talk video] [Slides download]
15:40 - 16:00 AND: Application-network Diagnosing System for Millions of IPs in Production Clouds
康鑫磊 (Alibaba) [Talk video] [Slides download]
16:00 - 16:20 Improving Resource and Energy Efficiency for Cloud 3D through Excessive Rendering Reduction
刘天义 (University of Texas at San Antonio) [Talk video] [Slides download]
16:20 - 16:30 Tea Break
16:30 - 16:50 Oral Session #5
Storage
Session Chair:
张杰 (Peking University)
Designing an Efficient Data Deduplication Scheme for File-Based Encrypted Mobile Systems
黄辉 (Chongqing University) [Talk video]
16:50 - 17:10 Ethane: An Asymmetric File System for Disaggregated Persistent Memory
蔡淼 (Nanjing University of Aeronautics and Astronautics) [Talk video]
17:10 - 17:30 TeRM: Extending RDMA-Attached Memory with SSD
杨者 (Tsinghua University) [Talk video] [Slides download]
17:30 - 17:50 Sync+Sync: A Covert Channel Built on fsync with Persistent Storage
王春东 (ShanghaiTech University) [Talk video] [Slides download]
17:50 - 18:00 Closing Summary [Talk video]
18:00 Dinner


Poster Schedule

2024-06-15 (Day 1):
Session ID Paper Author
0-Operating System 1 Adaptive Memory Swapping to Improve User Experience on Mobile Devices 李文通 (East China Normal University)
2 Detecting Smart Home Automation Application Interferences with Domain Knowledge 汪涛 (Institute of Software, Chinese Academy of Sciences)
3 Efficient Maximal Biclique Enumeration on GPUs 潘哲 (Zhejiang University)
4 GraalVM as a generic runtime for FOSS EDA 李枫 (Independent Developer)
5 HydraRPC: RPC in the CXL Era 马腾 (Alibaba Group)
6 Live Migration of Virtual Machines Based on Dirty Page Similarity 程延博 (Lanzhou University)
7 Quantized Data Transmission Optimization for Distributed GMRES Algorithm 高建花 (Beijing Normal University)
8 SandTable: Scalable Distributed System Model Checking with Specification-Level State Exploration 唐瑞泽 (Nanjing University)
9 TCSA: Efficient Localization of Busy-Wait Synchronization Bugs for Latency-Critical Applications 李宁 (East China Normal University)
10 Userspace Bypass: Accelerating Syscall-intensive Applications 周喆 (Fudan University)
11 A Network Intrusion Detection System for Container Clusters 张良康 (Huazhong University of Science and Technology)
1-MLSys+GPU 12 Aceso: Efficient Parallel DNN Training through Iterative Bottleneck Alleviation 刘国栋 (Institute of Computing Technology, Chinese Academy of Sciences)
13 GNNavigator: Towards Adaptive Training of Graph Neural Networks via Automatic Guideline Exploration 乔同 (Beihang University)
14 Graph Neural Networks Automated Design and Deployment on Device-Edge Co-Inference Systems 周傲 (Beihang University)
15 INSPIRE: Accelerating Deep Neural Networks via Hardware-friendly Index-Pair Encoding 汪宗武 (Shanghai Jiao Tong University)
16 MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs 陈扬锐 (ByteDance)
17 Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances 段江飞 (The Chinese University of Hong Kong)
18 SiBrain: A Sparse Spatio-temporal Parallel Neuromorphic Architecture for Accelerating Spiking Convolution Neural Networks with Low Latency 崔友锋 (Guangdong University of Technology)
19 SpecFL: An Efficient Speculative Federated Learning System for Tree-based Model Training 张玉会 (Institute of Information Engineering, Chinese Academy of Sciences)
20 Efficient SpMM Accelerator for Deep Learning: Sparkle and Its Automated Generator 姜晶菲 (National University of Defense Technology)
21 A Holistic Functionalization Approach to Optimizing Imperative Tensor Programs in Deep Learning 麻津铭 (Shanghai AI Laboratory)
2-Architecture 22 AIG-CIM: A Scalable Chiplet Module with Tri-Gear Heterogeneous Compute-in-Memory for Diffusion Acceleration 孙奕扬 (Peking University)
23 Alchemist: A Unified Accelerator Architecture for Cross-Scheme Fully Homomorphic Encryption 穆嘉楠 (Institute of Computing Technology, Chinese Academy of Sciences)
24 Cuper: Customized Dataflow and Perceptual Decoding for Sparse Matrix-Vector Multiplication on HBM-Equipped FPGAs 伊恩鑫 (China University of Petroleum, Beijing)
25 Efficient Cross-platform Multiplexing of Hardware Performance Counters via Adaptive Grouping 刘通宇 (East China Normal University)
26 NDPBridge: Enabling Cross-Bank Coordination in Near-DRAM-Bank Processing Architectures 田博宇 (Tsinghua University)
27 SMG: A System-level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing 侯小凤 (Shanghai Jiao Tong University)
28 ReCG: ReRAM-Accelerated Sparse Conjugate Gradient 范明嘉 (China University of Petroleum, Beijing)
29 SegScope: Probing Fine-grained Interrupts via Architectural Footprints 张鑫 (Peking University)
30 QuFEM: Fast and Accurate Quantum Readout Calibration Using the Finite Element Method 张涵禹 (Zhejiang University)
31 SpREM: Exploiting Hamming Sparsity for Fast Quantum Readout Error Mitigation 张涵禹 (Zhejiang University)
32 Tyche: An Efficient and General Prefetcher for Indirect Memory Accesses 薛峰 (Institute of Computing Technology, Chinese Academy of Sciences)
33 Optimization of current DMA operation with allocation and mapping 朱彦军 (Intel/IONOS)

2024-06-16 (Day 2):
Session ID Paper Author
3-Cloud Computing 1 A Tale of Two Paths: Toward a Hybrid Data Plane for Efficient Far-Memory Applications 陈磊 (Institute of Computing Technology, Chinese Academy of Sciences)
2 Flagger: Near-Data Acceleration for Large-Scale Cross-Silo Federated Learning Aggregation 张杰 (Peking University)
3 FUYAO: DPU-enabled Direct Data Transfer for Serverless Computing 刘国威 (Tianjin University)
4 Rethinking an eBPF-based lightweight and unified solution for Edge networking 李枫 (Independent Developer)
4-Storage 5 A Write-Optimized PM-oriented B+-tree with Aligned Flush and Selective Migration 李明杰 (Guizhou University)
6 Boosting File Systems Elegantly: a Case for a Transparent NVM Page Cache 王国毓 (Jilin University)
7 CCL-BTree: A Crash-Consistent Locality-Aware B+-Tree for Reducing XPBuffer-Induced Write Amplification in Persistent Memory 李振鑫 (Zhejiang University)
8 Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database Construction 宋建森 (Institute of Software, Chinese Academy of Sciences)
9 Differential Optimization Testing of Gremlin-Based Graph Database Systems 郑莹莹 (Institute of Software, Chinese Academy of Sciences)
10 Efficient Large Graph Processing with Chunk-Based Graph Representation Model 宗威旭 (Zhejiang University)
11 Exploit both SMART Attributes and NAND Flash Wear Characteristics to Effectively Forecast SSD-based Storage Failures in Clusters 谷云飞 (Shanghai Jiao Tong University)
12 Fast and Scalable In-network Lock Management Using Lock Fission 张汉泽 (Shanghai Jiao Tong University)
13 HADB: Hotness-Aware Key-Value Store with Persistent Memory 谭蕴麟 (Guizhou University)
14 Hardware-Software Co-Designs of User-Space All-Flash Array Engine 张杰 (Peking University)
15 Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters 莫梓钊 (University of Macau)
16 Improving Graph Compression for Efficient Resource-Constrained Graph Analytics 许骞 (Renmin University of China)
17 OmniCache: Collaborative Caching for Near-storage Accelerators 张坚 (Rutgers University)
18 PolarDB-SCC: A Cloud-Native Database Ensuring Low Latency for Strongly Consistent Reads 陈浩 (Alibaba Cloud)
19 SMART: A High-Performance Adaptive Radix Tree for Disaggregated Memory 罗旭川 (Fudan University)
20 SODA: A Set of Fast Oblivious Algorithms in Distributed Secure Data Analytics 李想 (Tsinghua University)
21 Understanding Transaction Bugs in Database Systems 崔紫玉 (Institute of Software, Chinese Academy of Sciences)