Kilian Haefeli
Hi, I am Kilian. I am a Machine Learning Engineer and Researcher studying at ETH Zurich.
I am interested in Large Neural Networks, their training and generalization dynamics and how to scale them.
I previously interned at Aleph Alpha and worked on Diffusion Models and Graph Neural Networks.
In another life I was a founding engineer at Airica, which we sold to Logitech.
I occasionally write about stuff: Blog
Email /
GitHub /
Google Scholar /
LinkedIn /
CV
|
|
Research
Currently I am researching the emergence of in-context learning and how it fits into our broader understanding of generalization.
|
|
Efficient Neural Representation Learning for Star-Convex Boundaries
Kilian Haefeli
, 2022
website /
A neural representation model for predicting the temporal evolution of a phase boundary generated during 3D-printing.
We use a direct star-convex parameterization of the phase boundary rather than learning the underlying temperature field. The parameterization at a single time step is learned by a Graph Neural Network, and its evolution over time by an RNN.
|
|
Diffusion Models for Graphs Benefit From Discrete State Spaces
Kilian Haefeli, Karolis Martinkus, Nathanaël Perraudin, Roger Wattenhofer
Learning on Graphs Conference and NeurIPS 2022 GLFrontiers Workshop, 2021
arxiv /
code /
website /
A Diffusion Model for graphs that uses discrete Bernoulli perturbations over edge connections.
This approach maintains sparsity and samples with far fewer steps, resulting in new state-of-the-art graph generation.
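To make the idea of a discrete Bernoulli perturbation concrete, here is a minimal sketch of a single noising step on a small graph. It is only an illustration and not the code from the paper: the function name `flip_edges`, the parameter `flip_prob` and the example graph are all hypothetical, and the sketch assumes an undirected graph stored as a symmetric 0/1 adjacency matrix.

```python
import random


def flip_edges(adj, flip_prob, rng):
    """Return a noisy copy of a symmetric 0/1 adjacency matrix.

    Each undirected edge slot (i, j) with i < j is flipped independently
    with probability flip_prob, i.e. a Bernoulli perturbation per edge.
    The input matrix is left unchanged.
    """
    n = len(adj)
    noisy = [row[:] for row in adj]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < flip_prob:
                noisy[i][j] = 1 - noisy[i][j]
                noisy[j][i] = noisy[i][j]  # keep the matrix symmetric
    return noisy


# Hypothetical 4-node path graph, used only for illustration.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
noisy_path = flip_edges(path, flip_prob=0.1, rng=random.Random(0))
```

Because every entry stays in {0, 1}, the noisy graph is still a graph at every step, unlike with continuous Gaussian noise added to the adjacency matrix.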
|
Experience
Working with fast-paced, sharp-minded people has been one of my greatest experiences.
|
|
Aleph Alpha
2023-10-01 / 2024-01-14
LLM Engineer Intern
website /
Working on Retrieval Augmented Generation for a chat application.
|
|
Logitech
2022-06-01 / 2022-10-01
Junior Data Scientist
website /
Built Recurrent Neural Nets for time-series prediction of CO2 concentration and room occupancy.
|
|
Airica
2020-05-01 / 2022-09-01
Co-Founder & Junior Data Scientist
website /
Together with my friends Lukas Limacher and Vassilis Kalofolias, I started an IoT company specializing in models for predicting meeting room occupancy.
The company was acquired by Logitech in 2022.
|
|
University of Toronto Exchange
2023-12-01
Attending UofT ECEE as a Graduate Exchange Student learning about Information Theory, Statistical Learning Theory and Parallel Systems.
|
|
ETH Zurich Masters, EECS
2022-10-01
Attending the EECS master's program, focusing on optimization and theory of Neural Networks as well as systems for Transformers.
|
|
ETH Zurich Bachelors, EECS
2019-10-01
Coursework focused on Systems, Algorithms and Machine Learning, with special focus on generative models.
|
|
Flash Attention in C CUDA
2024-04-01
website /
A CUDA C implementation of Flash Attention without using any libraries such as cuBLAS or CUTLASS. In ~300 lines of code this kernel is faster and more memory efficient than the standard PyTorch attention module.
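The memory savings of Flash Attention come from processing keys and values in tiles while keeping a running softmax, so the full score matrix is never stored. Below is a minimal sketch of that idea for a single query, written in plain Python rather than CUDA C for brevity. It is not the kernel from the repository: the function names, the `tile` parameter and the list-of-lists layout are hypothetical.

```python
import math


def _score(q, k):
    """Scaled dot product between one query and one key."""
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))


def naive_attention(q, keys, values):
    """Reference version: materializes all scores before the softmax."""
    scores = [_score(q, k) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[c] for w, v in zip(weights, values)) / total
            for c in range(dim)]


def tiled_attention(q, keys, values, tile=2):
    """Same result, but visits keys/values tile by tile (online softmax)."""
    running_max = -math.inf
    denom = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), tile):
        scores = [_score(q, k) for k in keys[start:start + tile]]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated so far to the new maximum.
        # On the first tile this factor is exp(-inf) = 0.0.
        rescale = math.exp(running_max - new_max)
        denom *= rescale
        acc = [a * rescale for a in acc]
        for s, v in zip(scores, values[start:start + tile]):
            w = math.exp(s - new_max)
            denom += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / denom for a in acc]
```

Only the running maximum, the running denominator and one output accumulator are kept per query, which is what lets a GPU kernel hold the working set in fast on-chip memory.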
|
|
Attention in C CUDA
2024-03-01
website /
A CUDA C implementation of the Attention operator. Multiple increasingly optimized versions of Matrix Transpose, Matmul and Softmax kernels are provided.
|
|
Optimizable DeepPoly
2022-10-01
website /
An optimizable Neural DeepPoly Verifier implemented as torch Modules. Compatible with any Optimizer Setup that PyTorch has to offer. This serves to efficiently and tightly verify neural networks for adversarial robustness.
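For readers unfamiliar with DeepPoly, the sketch below shows the standard linear relaxation of a single ReLU given input bounds. It assumes that the optimizable quantity is the slope of the lower bound, which is a common choice but is not confirmed by the project description. The function name `relu_relaxation` and the parameter `lower_slope` are hypothetical, and plain Python floats stand in for torch tensors.

```python
def relu_relaxation(low, high, lower_slope):
    """Linear bounds for y = max(0, x) on the interval [low, high].

    Returns ((a_lo, b_lo), (a_up, b_up)) such that
    a_lo * x + b_lo <= max(0, x) <= a_up * x + b_up for all x in [low, high].
    lower_slope must lie in [0, 1]; it only matters when low < 0 < high
    and is the parameter an optimizer could tune (an assumption here).
    """
    if not 0.0 <= lower_slope <= 1.0:
        raise ValueError("lower_slope must be in [0, 1]")
    if high <= 0.0:
        # ReLU is constantly zero on the interval.
        return (0.0, 0.0), (0.0, 0.0)
    if low >= 0.0:
        # ReLU is the identity on the interval.
        return (1.0, 0.0), (1.0, 0.0)
    # Unstable neuron: the upper bound is the chord through
    # (low, 0) and (high, high); the lower bound is a line through 0.
    up_slope = high / (high - low)
    up_intercept = -low * up_slope
    return (lower_slope, 0.0), (up_slope, up_intercept)


# Hypothetical pre-activation bounds, used only for illustration.
lower, upper = relu_relaxation(-1.0, 3.0, lower_slope=0.5)
```

Any slope in [0, 1] gives a sound lower bound, so tuning it by gradient descent can tighten the verified region without losing correctness.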
|
|