Conference Schedule
2024-06-15 (Day 1): [Full replay]
Time | Sessions | Talks
8:45 - 8:55 | Opening Introduction | [Talk video]
8:55 - 9:00 | Leadership Address |
9:00 - 9:40 | Keynote Session #1; Session Chair: 何水兵(浙江大学) | Methods for Improving the Performance of ZNS SSDs 冯丹(华中科技大学) [Talk video]
9:40 - 10:20 | | The 智海 Series of Vertical-Domain Large Models and AI Agents 吴飞(浙江大学) [Talk video] [Slides download]
10:20 - 10:30 | Tea Break |
10:30 - 11:10 | Keynote | 方天视窗: A Highly Parallel Unified Rendering Architecture 杨程云(华为) [Talk video]
11:10 - 12:00 | Lightning Talk #1; Session Chair: 陈爽(华为) | Lightning Talk [Talk video] [Slides download]
12:00 - 13:30 | Lunch + Poster Session |
13:30 - 13:50 | Oral Session #1: Operating System; Session Chair: 古金宇(上海交通大学) | An Empirical Study of Rust-for-Linux: The Success, Dissatisfaction, and Compromise 李弘宇(北京邮电大学) [Talk video] [Slides download]
13:50 - 14:10 | | PathFuzz: Broadening Fuzzing Horizons with Footprint Memory for CPUs 徐易难(中国科学院计算技术研究所) [Talk video] [Slides download]
14:10 - 14:30 | | On-the-fly Quarantine Before Patches for N-day Kernel Vulnerabilities Are Available 戴钦润(科罗拉多大学博尔德分校) [Talk video] [Slides download]
14:30 - 14:50 | | Flexible, Secure and Efficient CVM Maintenance with Confidential Procedure Calls 陈家浩(上海交通大学) [Talk video] [Slides download]
14:50 - 15:10 | | Taming Hot Bloat Under Virtualization with HugeScope 李传东(北京大学) [Talk video] [Slides download]
15:10 - 15:20 | Tea Break |
15:20 - 15:40 | Oral Session #2: MLSys + GPU; Session Chair: 郑鹏飞(华为) | CMC: Video Transformer Accelerator with CODEC Assisted Matrix Condensing 宋卓然(上海交通大学) [Talk video] [Slides download]
15:40 - 16:00 | | MagPy: Effective Operator Graph Instantiation for Deep Learning by Execution State Monitoring 张晨(清华大学) [Talk video] [Slides download]
16:00 - 16:20 | | Soter: Analytical Tensor-Architecture Modeling and Automatic Tensor Program Tuning for Spatial Accelerators 王福宇(中山大学) [Talk video] [Slides download]
16:20 - 16:40 | | Removing Obstacles before Breaking Through the Memory Wall: A Close Look at HBM Errors in the Field 吴榕龙(厦门大学) [Talk video] [Slides download]
16:40 - 16:50 | Tea Break |
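16:50 - 17:10 | Industry Session; Session Chair: 徐尔茨(PDL) | Large-Scale Industrial Application of Graph Computing: Practice and Challenges 洪春涛(蚂蚁集团) [Talk video] [Slides download]
17:10 - 17:30 | | Building Ten-Thousand-Card Clusters for the AI 2.0 Era: AI Infra Practice at 零一万物 谢文(零一万物) [Talk video]
17:30 - 17:50 | | AI Systems in the Era of Large Models: Challenges and Outlook 王喆锋(华为云) [Talk video]
17:50 - 18:10 | Q & A [Talk video] |
18:10 | Banquet |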
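2024-06-16 (Day 2): [Full replay]
Time | Sessions | Talks
9:00 - 9:40 | Keynote Session #2; Session Chair: 毛波(厦门大学) | Research on Software and Hardware for “General-Purpose” Brain-Inspired Computing Systems 张悠慧(清华大学) [Talk video]
9:40 - 10:20 | | Matrix Computation Optimization for Multi-core Processors 董德尊(国防科技大学) [Talk video] [Slides download]
10:20 - 10:30 | Tea Break |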
10:30 - 10:50 | Best Paper Session; Session Chair: 魏星达(上海交通大学) | Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning 陈畅(北京大学) [Talk video] [Slides download]
10:50 - 11:10 | | What’s the Story in EBS Glory: Evolutions and Lessons in Building Cloud Block Store 张伟东(阿里巴巴) [Talk video] [Slides download]
11:10 - 11:30 | | Towards a Shared-storage-based Serverless Database Achieving Seamless Scale-up and Read Scale-out 陈浩(阿里巴巴) [Talk video] [Slides download]
11:30 - 12:00 | Lightning Talk #2; Session Chair: 汪睿(浙江大学) | Lightning Talk [Talk video] [Slides download]
12:00 - 13:30 | Lunch + Poster Session |
13:30 - 13:50 | Oral Session #3: Architecture; Session Chair: 卢丽强(浙江大学) | StreamPIM: Streaming Matrix Computation in Racetrack Memory 安昱达(北京大学) [Talk video] [Slides download]
13:50 - 14:10 | | UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space 刘方鑫(上海交通大学) [Talk video] [Slides download]
14:10 - 14:30 | | AVM-BTB: Adaptive and Virtualized Multi-level Branch Target Buffer 刘蕴哲(中国科学院计算所) [Talk video] [Slides download]
14:30 - 14:50 | | An Instruction Inflation Analyzing Framework for Dynamic Binary Translators 谢本壹(中国科学院计算所) [Talk video] [Slides download]
14:50 - 15:00 | Tea Break |
15:00 - 15:20 | Oral Session #4: Cloud Computing; Session Chair: 付森波(华为) | Harmonizing Efficiency and Practicability: Optimizing Resource Utilization in Serverless Computing with Jiagu 柳清源(上海交通大学) [Talk video]
15:20 - 15:40 | | UFO: The Ultimate QoS-Aware CPU Core Management for Virtualized and Oversubscribed Public 彭雅娟(中国科学院深圳先进技术研究院) [Talk video] [Slides download]
15:40 - 16:00 | | AND: Application-network Diagnosing System for Millions of IPs in Production Clouds 康鑫磊(阿里巴巴) [Talk video] [Slides download]
16:00 - 16:20 | | Improving Resource and Energy Efficiency for Cloud 3D through Excessive Rendering Reduction 刘天义(得克萨斯大学圣安东尼奥分校) [Talk video] [Slides download]
16:20 - 16:30 | Tea Break |
16:30 - 16:50 | Oral Session #5: Storage; Session Chair: 张杰(北京大学) | Designing an Efficient Data Deduplication Scheme for File-Based Encrypted Mobile Systems 黄辉(重庆大学) [Talk video]
16:50 - 17:10 | | Ethane: An Asymmetric File System for Disaggregated Persistent Memory 蔡淼(南京航空航天大学) [Talk video]
17:10 - 17:30 | | TeRM: Extending RDMA-Attached Memory with SSD 杨者(清华大学) [Talk video] [Slides download]
17:30 - 17:50 | | Sync+Sync: A Covert Channel Built on fsync with Persistent Storage 王春东(上海科技大学) [Talk video] [Slides download]
17:50 - 18:00 | Closing Summary [Talk video] |
18:00 | Dinner |
Poster Schedule
2024-06-15 (Day 1):
Session | ID | Paper | Author
0-Operating System | 1 | Adaptive Memory Swapping to Improve User Experience on Mobile Devices | 李文通(华东师范大学)
0-Operating System | 2 | Detecting Smart Home Automation Application Interferences with Domain Knowledge | 汪涛(中国科学院软件研究所)
0-Operating System | 3 | Efficient Maximal Biclique Enumeration on GPUs | 潘哲(浙江大学)
0-Operating System | 4 | GraalVM as a generic runtime for FOSS EDA | 李枫(独立开发者)
0-Operating System | 5 | HydraRPC: RPC in the CXL Era | 马腾(阿里巴巴集团)
0-Operating System | 6 | Live Migration of Virtual Machines Based on Dirty Page Similarity | 程延博(兰州大学)
0-Operating System | 7 | Quantized Data Transmission Optimization for Distributed GMRES Algorithm | 高建花(北京师范大学)
0-Operating System | 8 | SandTable: Scalable Distributed System Model Checking with Specification-Level State Exploration | 唐瑞泽(南京大学)
0-Operating System | 9 | TCSA: Efficient Localization of Busy-Wait Synchronization Bugs for Latency-Critical Applications | 李宁(华东师范大学)
0-Operating System | 10 | Userspace Bypass: Accelerating Syscall-intensive Applications | 周喆(复旦大学)
0-Operating System | 11 | A Network Intrusion Detection System for Container Clusters | 张良康(华中科技大学)
1-MLSys+GPU | 12 | Aceso: Efficient Parallel DNN Training through Iterative Bottleneck Alleviation | 刘国栋(中国科学院计算技术研究所)
1-MLSys+GPU | 13 | GNNavigator: Towards Adaptive Training of Graph Neural Networks via Automatic Guideline Exploration | 乔同(北京航空航天大学)
1-MLSys+GPU | 14 | Graph Neural Networks Automated Design and Deployment on Device-Edge Co-Inference Systems | 周傲(北京航空航天大学)
1-MLSys+GPU | 15 | INSPIRE: Accelerating Deep Neural Networks via Hardware-friendly Index-Pair Encoding | 汪宗武(上海交通大学)
1-MLSys+GPU | 16 | MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs | 陈扬锐(字节跳动)
1-MLSys+GPU | 17 | Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances | 段江飞(香港中文大学)
1-MLSys+GPU | 18 | SiBrain: A Sparse Spatio-temporal Parallel Neuromorphic Architecture for Accelerating Spiking Convolution Neural Networks with Low Latency | 崔友锋(广东工业大学)
1-MLSys+GPU | 19 | SpecFL: An Efficient Speculative Federated Learning System for Tree-based Model Training | 张玉会(中国科学院信息工程研究所)
1-MLSys+GPU | 20 | Efficient SpMM Accelerator for Deep Learning: Sparkle and Its Automated Generator | 姜晶菲(中国人民解放军国防科技大学)
1-MLSys+GPU | 21 | A Holistic Functionalization Approach to Optimizing Imperative Tensor Programs in Deep Learning | 麻津铭(上海人工智能实验室)
2-Architecture | 22 | AIG-CIM: A Scalable Chiplet Module with Tri-Gear Heterogeneous Compute-in-Memory for Diffusion Acceleration | 孙奕扬(北京大学)
2-Architecture | 23 | Alchemist: A Unified Accelerator Architecture for Cross-Scheme Fully Homomorphic Encryption | 穆嘉楠(中国科学院计算技术研究所)
2-Architecture | 24 | Cuper: Customized Dataflow and Perceptual Decoding for Sparse Matrix-Vector Multiplication on HBM-Equipped FPGAs | 伊恩鑫(中国石油大学(北京))
2-Architecture | 25 | Efficient Cross-platform Multiplexing of Hardware Performance Counters via Adaptive Grouping | 刘通宇(华东师范大学)
2-Architecture | 26 | NDPBridge: Enabling Cross-Bank Coordination in Near-DRAM-Bank Processing Architectures | 田博宇(清华大学)
2-Architecture | 27 | SMG: A System-level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing | 侯小凤(上海交通大学)
2-Architecture | 28 | ReCG: ReRAM-Accelerated Sparse Conjugate Gradient | 范明嘉(中国石油大学(北京))
2-Architecture | 29 | SegScope: Probing Fine-grained Interrupts via Architectural Footprints | 张鑫(北京大学)
2-Architecture | 30 | QuFEM: Fast and Accurate Quantum Readout Calibration Using the Finite Element Method | 张涵禹(浙江大学)
2-Architecture | 31 | SpREM: Exploiting Hamming Sparsity for Fast Quantum Readout Error Mitigation | 张涵禹(浙江大学)
2-Architecture | 32 | Tyche: An Efficient and General Prefetcher for Indirect Memory Accesses | 薛峰(中国科学院计算技术研究所)
2-Architecture | 33 | Optimization of current DMA operation with allocation and mapping | 朱彦军(Intel/IONOS)
2024-06-16 (Day 2):
Session | ID | Paper | Author
3-Cloud Computing | 1 | A Tale of Two Paths: Toward a Hybrid Data Plane for Efficient Far-Memory Applications | 陈磊(中国科学院计算技术研究所)
3-Cloud Computing | 2 | Flagger: Near-Data Acceleration for Large-Scale Cross-Silo Federated Learning Aggregation | 张杰(北京大学)
3-Cloud Computing | 3 | FUYAO: DPU-enabled Direct Data Transfer for Serverless Computing | 刘国威(天津大学)
3-Cloud Computing | 4 | Rethinking an eBPF-based lightweight and unified solution for Edge networking | 李枫(独立开发者)
4-Storage | 5 | A Write-Optimized PM-oriented B+-tree with Aligned Flush and Selective Migration | 李明杰(贵州大学)
4-Storage | 6 | Boosting File Systems Elegantly: a Case for a Transparent NVM Page Cache | 王国毓(吉林大学)
4-Storage | 7 | CCL-BTree: A Crash-Consistent Locality-Aware B+-Tree for Reducing XPBuffer-Induced Write Amplification in Persistent Memory | 李振鑫(浙江大学)
4-Storage | 8 | Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database Construction | 宋建森(中国科学院软件研究所)
4-Storage | 9 | Differential Optimization Testing of Gremlin-Based Graph Database Systems | 郑莹莹(中国科学院软件研究所)
4-Storage | 10 | Efficient Large Graph Processing with Chunk-Based Graph Representation Model | 宗威旭(浙江大学)
4-Storage | 11 | Exploit both SMART Attributes and NAND Flash Wear Characteristics to Effectively Forecast SSD-based Storage Failures in Clusters | 谷云飞(上海交通大学)
4-Storage | 12 | Fast and Scalable In-network Lock Management Using Lock Fission | 张汉泽(上海交通大学)
4-Storage | 13 | HADB: Hotness-Aware Key-Value Store with Persistent Memory | 谭蕴麟(贵州大学)
4-Storage | 14 | Hardware-Software Co-Designs of User-Space All-Flash Array Engine | 张杰(北京大学)
4-Storage | 15 | Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters | 莫梓钊(澳门大学)
4-Storage | 16 | Improving Graph Compression for Efficient Resource-Constrained Graph Analytics | 许骞(中国人民大学)
4-Storage | 17 | OmniCache: Collaborative Caching for Near-storage Accelerators | 张坚(罗格斯大学)
4-Storage | 18 | PolarDB-SCC: A Cloud-Native Database Ensuring Low Latency for Strongly Consistent Reads | 陈浩(阿里云)
4-Storage | 19 | SMART: A High-Performance Adaptive Radix Tree for Disaggregated Memory | 罗旭川(复旦大学)
4-Storage | 20 | SODA: A Set of Fast Oblivious Algorithms in Distributed Secure Data Analytics | 李想(清华大学)
4-Storage | 21 | Understanding Transaction Bugs in Database Systems | 崔紫玉(中国科学院软件研究所)