A mostly unordered collection of teaching material, writing and code. Feel free to poke around!
If you use any of the material below, please make sure to cite this source. Thanks!
I wrote an extensive list of papers, books, tutorials, websites, code etc. that I would recommend to get started with neural networks, NLP and MT.
Introduction to Neural Networks
I recently gave a general introduction to feed-forward neural networks. Mostly technical, bare-bones explanations and code, no high-level libraries.
YouTube Playlist Gradient-based Learning
I recently started a YouTube channel! There are three videos so far, originally made for my students in a 2019 course. The videos are meant to give an intuition for what gradient-based learning is.
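As a one-screen companion to the videos, here is a minimal sketch of the core idea: gradient descent on a simple one-dimensional function. The function, starting point and learning rate are made up for illustration.

```python
# Gradient descent on f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Function, starting point and learning rate are arbitrary choices.

def grad_f(x):
    return 2 * (x - 3)

x = 0.0             # arbitrary starting point
learning_rate = 0.1

for step in range(50):
    x -= learning_rate * grad_f(x)  # take a small step against the gradient

print(x)  # approaches the minimum at x = 3
```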
YouTube Playlist Decoding Algorithms
Another playlist in which I explain sequence generation decoding algorithms such as beam search and constrained beam search.
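To make the idea concrete, here is a bare-bones beam search sketch. The `step_fn` interface and the toy distribution are my own illustration, not code from the videos:

```python
import math

def beam_search(step_fn, bos, eos, beam_size=3, max_len=10):
    """Bare-bones beam search. step_fn(prefix) returns a list of
    (token, probability) continuations of a given prefix."""
    beams = [(0.0, [bos])]   # hypotheses as (log probability, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for log_prob, tokens in beams:
            for token, prob in step_fn(tokens):
                candidates.append((log_prob + math.log(prob), tokens + [token]))
        # keep only the beam_size highest-scoring hypotheses
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for log_prob, tokens in candidates[:beam_size]:
            (finished if tokens[-1] == eos else beams).append((log_prob, tokens))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[0])

# toy "model" that ignores the prefix entirely
def toy_step(prefix):
    return [("a", 0.2), ("b", 0.5), ("</s>", 0.3)]

print(beam_search(toy_step, "<s>", "</s>"))
```

Greedy search is the special case `beam_size=1`; constrained beam search additionally filters the candidate list so that required tokens must appear.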
Introduction to Machine Learning
Selected slide sets and exercises from my introductory course on machine learning. Requirements: high-school math, statistics and basics of Python programming. This specific course was taught together with Phillip Ströbel - thanks, Phillip!
# | Topic | Slide set | Exercises Notebook
---|---|---|---
1 | Basic concepts of machine learning | 1.pdf (PDF, 8 MB) |
2 | First classification algorithm: KNN | 2.pdf (PDF, 4 MB) |
3 | First regression algorithm: linear regression | |
4 | Cross-validation, Hyperparameter Search | |
5 | Feature Extraction | |
6 | Overview of important classification algorithms | |
Introduction to Neural Machine Translation
Selected slide sets and exercises from my introductory course on neural machine translation. Requirements: fundamentals of machine learning, high-school math, statistics and basics of Python programming.
Some of the materials were developed together with Samuel Läubli.
# | Topic | Slide set
---|---|---
1 | Introduction | 1.pdf (PDF, 13 MB)
2 | Evaluation (this slide set by Samuel Läubli) | 2.pdf (PDF, 3 MB)
3 | Preprocessing |
4 | Statistical Machine Translation |
5 | Linear Algebra, Differential Calculus |
6 | Linear Models |
7 | Feed-forward Neural Networks | 7.pdf (PDF, 7 MB)
8 | Recurrent Neural Networks | 8.pdf (PDF, 4 MB)
9 | TensorFlow | 9.pdf (PDF, 4 MB)
10 | Encoder-Decoder Models | 10.pdf (PDF, 8 MB)
11 | Attention Networks | 11.pdf (PDF, 5 MB)
12 | Decoding (this slide set by Samuel Läubli) | 12.pdf (PDF, 663 KB)
13 | Current Research / Recent Improvements | 13.pdf (PDF, 4 MB)
14 | Summary | 14.pdf (PDF, 4 MB)
Try our educational (= slow, unstable, but insightful) NMT tool, daikon. It's written in TensorFlow, and you will need a GPU to train models. Its main authors are Samuel Läubli and me.
WhatsApp Author Identification
Take text messages from your favorite WhatsApp group and train a system that predicts which of your friends wrote a given message!
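If you just want the flavor of the task, a sketch like the following already gets you a baseline. This is my own illustration with scikit-learn and placeholder data, not the project's actual code:

```python
# Character n-gram features plus a linear classifier; the messages and
# author labels below are placeholders for a real exported chat.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

messages = ["see you at 8", "haha no way", "ok sounds good"]  # placeholder data
authors = ["anna", "ben", "anna"]                             # placeholder labels

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # character n-grams
    LogisticRegression(max_iter=1000),
)
model.fit(messages, authors)
print(model.predict(["haha ok then"]))
```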
Recipes for Sentence Classification with DyNet
Code that exemplifies neural network solutions for classification tasks with DyNet. On top of that, the code demonstrates how to implement a custom classifier that is compatible with scikit-learn's API.
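The scikit-learn part boils down to a pattern like the following. This is a toy sketch of my own, not the repository's code: subclass `BaseEstimator` and `ClassifierMixin` and implement `fit` and `predict`:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class MajorityClassifier(BaseEstimator, ClassifierMixin):
    """Toy estimator that always predicts the most frequent training
    label. A real DyNet model would build and train its network in fit()."""

    def fit(self, X, y):
        labels, counts = np.unique(y, return_counts=True)
        self.classes_ = labels                       # scikit-learn convention
        self.majority_ = labels[np.argmax(counts)]
        return self                                  # fit() must return self

    def predict(self, X):
        return np.array([self.majority_] * len(X))
```

Because the class follows the estimator API, it can be dropped into `cross_val_score`, pipelines and grid search like any built-in classifier.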
Forward passes of several flavours of recurrent neural networks in NumPy and TensorFlow.
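For the NumPy side, the core of a vanilla (Elman) RNN forward pass fits on a few lines. The sizes and random weights below are placeholders of my own, not the repository's code:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 8, 5

W_xh = rng.normal(size=(hidden_size, input_size))   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden weights
b_h = np.zeros(hidden_size)

inputs = rng.normal(size=(seq_len, input_size))     # one vector per time step
h = np.zeros(hidden_size)                           # initial hidden state

for x_t in inputs:
    # the new state depends on the current input and the previous state
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)

print(h.shape)  # (8,) -- the final hidden state
```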
A stripped-down version of the Moses repository, with only the scripts for preprocessing that most people still use.
Feed-forward neural networks with NumPy
Implementation of feed-forward networks using only NumPy. Thanks, Joel, for this idea!
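The gist, as a hedged sketch with placeholder sizes and weights (not Joel's or my repository's code): a forward pass is just alternating affine transforms and nonlinearities.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 5)), np.zeros(3)  # input (5) -> hidden (3)
W2, b2 = rng.normal(size=(2, 3)), np.zeros(2)  # hidden (3) -> output (2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = rng.normal(size=5)          # a single input vector
hidden = sigmoid(W1 @ x + b1)   # affine transform + nonlinearity
output = W2 @ hidden + b2       # output scores (logits)
print(output)
```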
Scripts that show how to train and use models with daikon, our educational NMT system (https://github.com/zurichnlp/daikon).
Scripts that show how to train and use models with Sockeye.
Guide to Scientific Writing (PDF, 265 KB)
Guide to writing a scientific thesis. Caution and disclaimer: this guide is unfinished and I will probably never get to work on it again (or write a thesis the way I am suggesting in it! :-).
Crowd-sourcing and English-centric research in NLP (PDF, 1 MB)
Thoughts on the question of whether crowd-sourcing facilitates research in non-English NLP. Presented at the Jožef Stefan Institute, Ljubljana.
Report on Feasibility of Stand-off Markup in TEI Documents (PDF, 307 KB)
Technical report on how exactly to organize Text+Berg annotation layers into several XML files.
Cost-effectiveness of Games with a Purpose for Collecting NLP Annotations (PDF, 5 MB)
Are games with a purpose a cost-effective method of collecting annotations for NLP research?
Treatment of Aphasia with Melodic Intonation Therapy in Tone Languages (PDF, 199 KB)
A seminar paper describing the setup and premise for an experiment that would investigate the merit of melodic intonation therapy to treat aphasia in speakers of tone languages.
Acquisition of Negation in English (link coming soon)
Discussing hypotheses about the acquisition of negation by English speakers.
A summary of our research at the Phonetics Lab that I gave at a conference for acousticians. In German.
Classifying Audience Reactions from Text (PDF, 270 KB)
Using the awesome CORPS corpus of speeches to classify text into audience reactions. For instance: look at a piece of text and try to determine whether the audience laughed after hearing it. The methods in the paper are questionable, but I think the idea itself is valid and still largely uncharted territory.
Typology of Nominal Plural Marking (link coming soon)
Looking at a sample of typologically diverse languages to analyze if and how they mark plural on nominal constructions.