CSC2541-f18 Course Project

1 Logistics

For the course project, you will implement a research idea on the topic of deep reinforcement learning. The purpose of the final project is to give you experience working on a piece of original research and writing up your results in a paper-style format. You are expected to describe your research idea or application clearly in the project proposal and relate it to existing work. You will document the project's progress in the final report. You must form a group of two or three to complete the project. There will be a short presentation at the end of the semester to showcase your work.

There are two important dates: the initial project proposal is due 11:59 pm Oct 14th, and the final report is due 11:59 pm Dec 16th. The write-ups are to be submitted to csc2541-f18-teaching@cs.toronto.edu with the title [CSC2541 Proposal/Report]. The policy regarding late submission can be found in Section 4.

2 Writing format

All submissions must be in PDF format. You may include algorithm blocks, tables, and figures in the write-ups. All write-ups should be prepared in the NIPS paper format: https://nips.cc/Conferences/2018/PaperInformation/StyleFiles.

Proposal: The project proposal is limited to two pages. It should have roughly the following sections:
• 1/2 page introduction
• 1/2 page related work
• 1/2 page method / algorithm
• 1/4 page abstract and references

Final report: The final report needs to have at least four pages of content, excluding references. You will expand your project proposal to include experiments and a comprehensive method section. You are expected to discuss the experimental results in detail and highlight any interesting findings.
3 How to choose a project

The course project should involve some form of reinforcement learning. You are encouraged to use neural networks as the function approximators for your method or application. There are two categories of projects to choose from.

Understanding and analysis: For beginners who would like a more in-depth understanding of deep reinforcement learning algorithms, it is often a good idea to re-implement an existing method and re-evaluate the implementation against standard benchmarks.
• Reproduce the experimental results from existing papers. Perform sensitivity analysis on hyperparameters.
• Apply / extend existing algorithms to a new application / task / game.

If you choose to work in this category, you will need to implement and analyze the performance of at least two different deep RL algorithms / methods in at least two different task domains, e.g., MuJoCo locomotion control and Atari Arcade Games. You are asked to discuss the strengths and weaknesses of each approach, backed by your experimental findings. Doing a proper analysis of existing methods is non-trivial. Here are two great examples of this type of study: Visualizing and Understanding Convolutional Networks https://arxiv.org/pdf/1311.2901.pdf and https://arxiv.org/pdf/1506.02078.pdf

Exploratory research: You may also choose to work on a novel research idea that may lead to a potential publication. Examples of such projects are:
• Improve / fix an existing algorithm. Evaluate the improvement on benchmark environments.
• Develop novel model architectures / algorithms for a new application / area / environment.

If you decide to work on a research idea, you will need to implement your method and compare its performance against at least one existing approach on your problem.
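As a planning aid, the "at least two algorithms in at least two task domains" requirement, combined with a hyperparameter sensitivity sweep, can be laid out as a grid of runs before any training code is written. The sketch below is only an illustration: the algorithm names, environment names, learning rates, and seed count are hypothetical placeholders, not values prescribed by this handout.

```python
from itertools import product

# Hypothetical placeholders; substitute the methods and tasks you actually study.
ALGORITHMS = ["algo_a", "algo_b"]            # at least two deep RL methods
ENVIRONMENTS = ["mujoco_task", "atari_game"]  # at least two task domains
LEARNING_RATES = [1e-4, 3e-4, 1e-3]           # one hyperparameter to sweep
SEEDS = [0, 1, 2]                             # repeated runs per setting


def experiment_grid(algorithms, environments, learning_rates, seeds):
    """Return one run configuration per combination of the four inputs."""
    return [
        {"algorithm": a, "environment": e, "learning_rate": lr, "seed": s}
        for a, e, lr, s in product(algorithms, environments, learning_rates, seeds)
    ]


runs = experiment_grid(ALGORITHMS, ENVIRONMENTS, LEARNING_RATES, SEEDS)
print(len(runs))  # 2 * 2 * 3 * 3 = 36 runs
print(runs[0])
```

Counting the runs up front makes it easier to judge whether the planned analysis fits the compute and time available to the group.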
Here is some advice on picking a good research problem from Bill Freeman: https://billf.mit.edu/sites/default/files/documents/cvprPapers.pdf and from parts III and IV of David Patterson's slides: https://people.eecs.berkeley.edu/~pattrsn/talks/BadCareer.pdf.

4 Grading scheme

• Project proposal 20% total marks
• Final project report 60% total marks

You may receive full marks for the course project by choosing either of the two categories. There will be no advantage for choosing an exploratory project over an analysis one regarding achieving a higher grade. The goal of the project is for you and your group to conduct original research. The proposal and the final report will be graded according to the criteria of top machine learning conference submissions. We will use the NIPS review criteria for this purpose:

• Quality: Is the paper technically sound? Are claims well-supported by theoretical analysis or experimental results? Is this a complete piece of work, or merely a position paper? Are the authors careful (and honest) about evaluating both the strengths and weaknesses of the work?
• Clarity: Is the paper clearly written? Is it well-organized? (If not, feel free to make suggestions to improve the manuscript.) Does it adequately inform the reader? Are the figures/tables properly labeled? (A superbly written paper provides enough information for the expert reader to reproduce its results.)
• Originality: Are the problems or approaches new? Is this a novel combination of familiar techniques? Is it clear how this work differs from previous contributions? Is related work adequately referenced? We recommend that you check the proceedings of recent NIPS conferences to make sure that each paper is significantly different from papers in previous proceedings. Abstracts and links to many of the previous NIPS papers are available from http://books.nips.cc.
• Significance: Are the results important? Are other people (practitioners or researchers) likely to use these ideas or build on them? Does the paper address a difficult problem in a better way than previous research? Does it advance the state of the art in a demonstrable way? Does it provide unique data, unique conclusions on existing data, or a unique theoretical or pragmatic approach?
Late submissions, except those covered by an official Student Medical Certificate, will be accepted with a 25% penalty for every 24 hours past the deadline, so an assignment submitted 4 days late receives 0%. Each group has a 48-hour "grace budget" for the entire semester that can be used across the proposal and the final report without penalty. Write-ups not submitted electronically before the deadline will be considered late. We will use the timestamp of the electronic submission to determine lateness. The time past the deadline will be rounded up to the next multiple of 24 hours.
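The late-penalty arithmetic can be sketched as follows. This is an unofficial illustration, not the grading script: the function name late_penalty is hypothetical, and the order of operations (grace time is subtracted first, then the remaining lateness is rounded up to whole 24-hour periods) is an assumption, since the policy text does not spell it out.

```python
import math
from datetime import datetime, timedelta

PENALTY_PER_PERIOD = 0.25            # 25% per 24 hours, from the policy above
PERIOD = timedelta(hours=24)
GRACE_BUDGET = timedelta(hours=48)   # per group, shared across both write-ups


def late_penalty(deadline, submitted, grace_left=timedelta(0)):
    """Return (penalty_fraction, grace_used) for one submission.

    Assumption: grace time is consumed before the remaining lateness
    is rounded up to whole 24-hour periods.
    """
    late = max(submitted - deadline, timedelta(0))
    grace_used = min(late, grace_left)
    billable = late - grace_used
    periods = math.ceil(billable / PERIOD)  # round up to started periods
    return min(1.0, PENALTY_PER_PERIOD * periods), grace_used


# Example: a proposal submitted 50 hours late by a group with its full budget.
deadline = datetime(2018, 10, 14, 23, 59)
penalty, used = late_penalty(deadline, deadline + timedelta(hours=50), GRACE_BUDGET)
print(penalty)               # 0.25 (2 billable hours round up to one period)
print(GRACE_BUDGET - used)   # 0:00:00 grace remaining for the final report
```

Under this reading, a group that spends its whole budget on the proposal has no grace time left for the final report.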