Paper Accepted at the ICML 2024 Workshop on Mechanistic Interpretability

We have a paper accepted at the Workshop on Mechanistic Interpretability at ICML this year!

Bortoletto, M., Ruhdorfer, C., Shi, L. & Bulling, A. (2024). Benchmarking Mental State Representations in Language Models. ICML Workshop on Mechanistic Interpretability.

Thanks to my co-authors, and especially to the first author, Matteo Bortoletto.