Written tests
There will be two written, closed-book, scored tests. They test practical knowledge of the material explained in lectures and/or lab sessions. Neither programming nor mathematical proofs are required. Test questions focus on data analysis, learning algorithms, and evaluation.
Test #1 is shorter (45 minutes) and takes place in the middle of the term. The final Test #2 is more complex and takes 80 minutes. Computers and calculators are neither necessary nor allowed. All questions are answered using only pen and paper.
The topics covered by the tests are listed below.
Test #1
 probability, conditional probability, statistical independence
 simple statistical data analysis: expected values, variation, correlation, median, quantiles
 confusion matrices, inter-annotator agreement
 classifier evaluation
 entropy, conditional entropy
 majority voting for ensemble classifiers
Test #2 – Final written test
Topics of the exercises span almost the whole course. You should be ready to answer questions related to data analysis, learning algorithms, and evaluation methods including statistical tests. Mathematical proofs and neural networks will not be required.
Requirements for obtaining the exam credit
Obtaining the course credit is a prerequisite for taking the examination in the course.
Questions for the oral examination:

Machine learning – basic concepts. What is machine learning, motivating examples of practical applications, theoretical foundations of machine learning. Supervised and unsupervised learning. Classification and regression tasks. Training and test examples. Feature vectors. Target variable and prediction function. Machine learning development cycle. Curse of dimensionality. Bayes classifier and Bayes error.

Clustering algorithms. Hierarchical clustering, k-means algorithm.

Decision tree learning. Decision tree learning algorithm, splitting criteria, pruning.

Linear regression. Least squares cost function.

Instance-based learning. k-NN algorithm.

Logistic regression. Discriminative classifier.

Naive Bayes learning. Naive Bayes classifier. Bayesian belief networks.

Support Vector Machines. Large margin classifier, soft margin classifier. Kernel functions. Multiclass classification.

Ensemble methods. Bagging and boosting. Unstable learning. AdaBoost algorithm. Random Forests.

Parameters in ML. Tuning of learning parameters. Grid search. Gradient descent algorithm. Maximum likelihood estimation.

Predictor evaluation. Working with development and test data. Sample error, generalization error. Cross-validation, leave-one-out method. Bootstrap methods. Performance measures. Coefficient of determination. Evaluation of binary classifiers. ROC curve.

Statistical tests. Statistical hypotheses, one-sample and two-sample t-tests, chi-square test of independence and goodness-of-fit test. Significance level, p-value. Using statistical tests for classifier evaluation. Confidence level, confidence intervals.

Overfitting. How to recognize and avoid it. Decision tree pruning. Regularization.

Dimensionality reduction. General principles of feature selection. Filters, wrappers, embedded methods. Feature selection using information gain. Forward selection and backward elimination. Principal Component Analysis.

Foundations of Neural Networks. Single Perceptron and Single Layer Perceptron – learning algorithms and mathematical interpretations. The architecture of multilayer feedforward models and the idea of backpropagation training.
Grading
Key requirements and contributions to the grade
 50% written tests
 20% homework
 30% oral examination
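As a rough illustration of how the three contributions combine, the sketch below computes a weighted score from the percentages listed above. It assumes each component is expressed on a 0 to 100 scale. The scale, the function name `weighted_score`, and the example numbers are hypothetical and not part of the course rules, and any mapping from the combined score to a final grade is not covered here.

```python
# Weights taken from the list above. The 0-100 scale for each
# component is an assumption made only for this illustration.
WEIGHTS = {
    "written_tests": 0.50,
    "homework": 0.20,
    "oral_exam": 0.30,
}


def weighted_score(scores):
    """Combine per-component scores (assumed 0-100) into one weighted score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student results, for illustration only.
example = {"written_tests": 80.0, "homework": 90.0, "oral_exam": 70.0}
print(weighted_score(example))
```

With these weights, a ten-point change in the written tests moves the combined score by five points, while the same change in homework moves it by two.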