Nils Blach

My research interests center on High-Performance Computing and Systems.

I currently work at Anthropic, an AI safety and research company based in San Francisco.

Previously, I was fortunate to work with and be mentored by Dr. Maciej Besta and Prof. Torsten Hoefler as part of the Scalable Parallel Computing Laboratory (SPCL) at ETH Zurich.


Highlights

Graph of Thoughts

GoT Scheme

Graph of Thoughts is our novel framework that advances prompting capabilities in large language models beyond paradigms such as Chain-of-Thought or Tree of Thoughts.

By modelling the information generated by an LLM as an arbitrary graph, we achieve significant advantages over the state of the art on different tasks, for example increasing the quality of sorting by 62% over ToT while simultaneously reducing costs by more than 31%.

We ensure that GoT is fully extensible with new thought transformations and thus can be used to spearhead new prompting schemes - check the code on GitHub!
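
To make the graph idea concrete, here is a minimal sketch of a thought graph applied to sorting. It is not the API of the GoT code on GitHub: the names `Thought`, `split`, `refine`, and `aggregate` are hypothetical, and plain Python calls (`sorted`, `heapq.merge`) stand in for the LLM calls a real prompting scheme would make.

```python
from dataclasses import dataclass, field
from heapq import merge


@dataclass
class Thought:
    """A vertex in the thought graph; `parents` are its incoming edges."""
    content: list
    parents: list = field(default_factory=list)


def split(thought, parts):
    """Generate: one thought fans out into several sub-problem thoughts."""
    size = -(-len(thought.content) // parts)  # ceiling division
    return [Thought(thought.content[i:i + size], [thought])
            for i in range(0, len(thought.content), size)]


def refine(thought):
    """Transform a single thought; sorted() stands in for an LLM call."""
    return Thought(sorted(thought.content), [thought])


def aggregate(thoughts):
    """Aggregate: several thoughts merge into one vertex with many parents."""
    merged = list(merge(*(t.content for t in thoughts)))
    return Thought(merged, list(thoughts))


# Hypothetical input: split into two sub-lists, sort each, merge the results.
root = Thought([5, 2, 9, 1, 7, 3, 8, 4])
result = aggregate([refine(t) for t in split(root, parts=2)])
print(result.content)
```

The aggregation step gives a vertex more than one parent, which is the kind of structure a chain or a tree of thoughts cannot represent.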

A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network

SF Cluster Illustration

In this work we present the first-ever real-world installation of the Slim Fly topology.

Slim Fly was the first network that lowered cost and power consumption, while simultaneously improving performance, by reducing the network diameter to two, promising significant improvements over established interconnects such as Fat Trees. However, until now, it had not been tested in practice. We address this by deploying the first at-scale Slim Fly installation and establish routines to guide future deployments of much higher scales.
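
As a small illustration of what a network diameter of two means, the sketch below computes the diameter of a graph with breadth-first search. The Petersen graph is used only as a well-known stand-in with diameter two; it is an assumption for illustration and not the topology deployed in this work, and the function names are hypothetical.

```python
from collections import deque


def petersen_graph():
    """Build the Petersen graph (10 vertices, degree 3) as an adjacency map."""
    adj = {v: set() for v in range(10)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for i in range(5):
        link(i, (i + 1) % 5)          # outer 5-cycle
        link(i, i + 5)                # spoke to the inner vertex
        link(5 + i, 5 + (i + 2) % 5)  # inner pentagram
    return adj


def eccentricity(adj, source):
    """Largest hop count from `source` to any vertex (graph assumed connected)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    """Diameter: the worst-case shortest-path length over all vertex pairs."""
    return max(eccentricity(adj, v) for v in adj)


graph = petersen_graph()
print(diameter(graph))
```

In a diameter-two network every pair of routers is at most two hops apart, so a minimal path crosses at most two inter-router links.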

Additionally, we introduce a novel high-performance routing architecture. While used with Slim Fly in our work, this architecture is fully portable and applicable to enhance any low-diameter interconnect.

Our benchmarks highlight the combined capabilities of Slim Fly and our novel routing in executing today’s challenging workloads, from LLM training to graph analytics.

cognote.ai

I co-founded cognote.ai, a startup aiming to address the growing documentation burden in healthcare by automating documentation using structured conversation summarisation based on speech recognition and deep NLP.

We scaled the team to 10+ members and executed a large data-collection and labeling study for this domain with over 1000 participants in collaboration with the LMU Klinikum.


Past Employment

RIKEN Center for Computational Science (Kobe, JP): worked on LLM-powered autonomous agents in scientific simulations with Dr. Jens Domke.

Huawei Technologies (Zurich, CH): worked on routing and scheduling algorithms for the 3D-Torus topology, with a specific focus on distributed training of large language models, with Dr. Anastasios Zouzias.

Scalable Parallel Computing Laboratory (Zurich, CH): worked on novel routing algorithms for low-diameter topologies deployed on the InfiniBand architecture with Dr. Maciej Besta and Prof. Torsten Hoefler.

Amazon (Madrid, ES): worked on an automated testing service for Amazon Business' product filtering system with Dr. Eduardo Bezerra.

Oracle Labs (Zurich, CH): worked on scaling computations to Hadoop clusters by enabling Spark and PySpark interpretation within the Data Studio environment with Michiel Haisma.


Publications

Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts
M. Besta, F. Memedi, Z. Zhang, R. Gerstenberger, N. Blach, P. Nyczyk, M. Copik, G. Kwasniewski, J. Müller, L. Gianinazzi, A. Kubicek, H. Niewiadomski, O. Mutlu, T. Hoefler.
arXiv‘24


A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network
N. Blach, M. Besta, D. De Sensi, J. Domke, H. Harake, S. Li, M. Konieczny, P. Iff, K. Lakhotia, A. Kubicek, M. Ferrari, F. Petrini, T. Hoefler.
NSDI‘24


Graph of Thoughts: Solving Elaborate Problems with Large Language Models
[First two authors contributed equally] M. Besta, N. Blach, A. Kubicek, R. Gerstenberger, L. Gianinazzi, J. Gajda, T. Lehmann, M. Podstawski, H. Niewiadomski, P. Nyczyk, T. Hoefler.
AAAI‘24


HOT: Higher-Order Dynamic Graph Representation Learning with Efficient Transformers
M. Besta, A. C. Catarino, L. Gianinazzi, N. Blach, P. Nyczyk, H. Niewiadomski, T. Hoefler.
LoG‘23


SMaRTT-REPS: Sender-based Marked Rapidly-adapting Trimmed & Timed Transport with Recycled Entropies
T. Bonato, A. Kabbani, D. De Sensi, R. Pan, Y. Le, C. Raiciu, M. Handley, T. Schneider, N. Blach, D. Alves, M. Papamichael, A. Caulfield, T. Hoefler.
arXiv‘24


PolarStar: Expanding the Scalability Horizon of Diameter-3 Networks
K. Lakhotia, L. Monroe, K. Isham, M. Besta, N. Blach, T. Hoefler, F. Petrini.
SPAA‘24


The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of Cores
M. Besta, R. Gerstenberger, M. Fischer, M. Podstawski, J. Müller, N. Blach, B. Egeli, G. Mitenkov, W. Chlapek, M. Michalewicz, T. Hoefler.
SC‘23

🏆 Best Paper Finalist


GDI: A Graph Database Interface Standard
M. Besta, R. Gerstenberger, N. Blach, M. Fischer, T. Hoefler.
GDI Specification


SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems
M. Besta, R. Kanakagiri, G. Kwasniewski, R. Ausavarungnirun, J. Beránek, K. Kanellopoulos, K. Janda, Z. Vonarburg-Shmaria, L. Gianinazzi, I. Stefan, J. Gómez-Luna, M. Copik, L. Kapp-Schwoerer, S. Di Girolamo, N. Blach, M. Konieczny, O. Mutlu, T. Hoefler.
MICRO‘21


Towards Automated Anamnesis Summarization: BERT-based Models for Symptom Extraction
[First three authors contributed equally] A. Schäfer, N. Blach, O. Rausch, M. Warm, N. Krüger.
NeurIPS‘20 ML4H