Course description and objectives
This course will cover a variety of statistical forecasting methods that are applicable in many functional areas of business, including simple and multiple regression, exponential smoothing, seasonal decomposition, and ARIMA models. The emphasis will be on learning to apply these methods to real data using a full-featured microcomputer statistics package (Statgraphics) and a spreadsheet package (Excel). The course objective is for you to gain competence in using these forecasting methods as well as in general statistical data analysis and computer modeling. By the end of the course you should know how to collect data from diverse sources, massage it into a form which can be analyzed on the computer, perform various kinds of exploratory data analysis, and ultimately generate forecasts from it by selecting and fitting an appropriate statistical model. We will also discuss some of the managerial issues surrounding the use of forecasting models in business and the challenges posed by recent dramatic developments in the U.S. and global economies. Concepts of time series analysis introduced in this course should prove helpful in courses and professional work in finance, marketing, operations, and consulting.
Prerequisites: A general familiarity with basic statistical concepts and an interest in using the computer for numerical detective work.
Our standard software for the course will be Statgraphics Plus for Windows version 5. Under the terms of our site license, Statgraphics is available for free home installation by Duke students. The software and installation instructions are available here on the Duke OIT web site.
The two Decision 411 CDs contain a tutorial on how to use Statgraphics in the form of screencam (Camtasia) movies. Just follow the directions in the readme file on the first CD. For a tutorial in hard-copy form, the handout called "Statgraphics 5: Overview & Tutorial Introduction" (included in the preassignment pack and also available here) contains all essential instructions for getting started with Statgraphics. You should work through either the first few screencam movies or the tutorial handout before or during the first week of class, to be prepared for the first homework assignment.
You should install the software and test it by working through the exercises in the "Tutorial Introduction" handout by the end of the first week of class: do not put yourself in the position of finding out that the software won't run properly on your machine on the night before the first assignment is due!
We will also be using Excel to some extent: I assume you are already rather familiar with that. PowerPoint is the recommended medium in which to prepare the presentations of homework assignments and final projects. Word is also useful for some kinds of data-cleaning operations (e.g., search-and-replace in text files), as well as for writing reports.
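If you would rather script this kind of search-and-replace cleanup than do it by hand in Word, the idea can be sketched in a few lines of Python. This is only an illustration and not part of the course software: the function name `clean_number_text` and the sample text are hypothetical, and the sketch assumes the raw file writes numbers with dollar signs and comma thousands separators, which a statistics package may not read as numeric.

```python
import re


def clean_number_text(text: str) -> str:
    """Remove dollar signs and thousands-separator commas from numbers.

    Hypothetical helper for illustration; adapt the patterns to your own file.
    """
    # Drop currency symbols, e.g. "$1,234" -> "1,234".
    text = text.replace("$", "")
    # Remove a comma only when it sits between a digit and a group of
    # exactly three digits, e.g. "1,234,567" -> "1234567".
    # Commas used as ordinary punctuation are left alone.
    return re.sub(r"(?<=\d),(?=\d{3}(?!\d))", "", text)


# Made-up sample: tab-separated month and sales figures.
sample = "Jan\t$1,234\nFeb\t$12,345,678\n"
cleaned = clean_number_text(sample)
print(cleaned)
```

To clean a whole file, you could read it into a string, pass it through the function, and write the result to a new file, keeping the original untouched so the cleanup can be audited.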
We will make heavy use of information technology in Decision 411. My office hours will be conducted electronically via e-mail (robert.nau@duke.edu) and the main class bulletin board. If you have a question for which you need a response directly from me, send it to me in an e-mail note.
I will try to respond to e-mail questions as soon as I can, usually within 24 hours, although I do not promise an answer to every question. Sometimes the point of a homework assignment is to learn by "muddling through" the analysis on your own. If you ask a particularly good question, or if several people ask the same question, I will usually post both the question (minus your name) and my response on the bulletin board for the benefit of everyone else. You should therefore always check the main bulletin board before e-mailing me with a question, to see whether the answer has already been posted. I recommend that you check the bulletin board at least a couple of times each week to keep up with Frequently Asked Questions.
If you have a comment of general interest or a question that could perhaps be answered by another student, you can post it on the Decision 411 bulletin board yourself. (If it is a time-sensitive question expressly for me, do NOT post it on the bulletin board--send it by e-mail instead.) For example, if problems arise with the software or data files--which I hope they won't--the bulletin board should be used to alert everyone and suggest solutions. You can also use the bulletin board to ask and answer questions about how to use statistical methods or software in generic situations. But please don't post anything that reveals the details of your approach to a homework problem. For example, don't post a public note that says something like "On assignment 3 I regressed X on Y and Z and got an R-squared of 23%, but there was a humongous outlier in period 68. Is this a problem?" As an honor code matter, you should observe the same rules of privacy on the bulletin board as would apply to conversations in the hallway or the lab with students other than your study partner.
I will use the (internal) Decision 411 course page for posting lecture notes and data files. I will hand out hard copies of the lecture notes on a lecture-by-lecture basis. You can print them off yourself from the web site in advance if you wish, but I may edit or expand them as we go along, so the version handed out in class may be more current than what is out there right now. I plan to maintain the Decision 411 web site indefinitely, and I will continue to expand and update it. In the future, if you want to review the course-related material that has been posted here, I suggest that you consult the web page rather than your old printed copy so that you will get the most complete and up-to-date information.
There will be three "regular" homework assignments during the term, an in-class quiz, and a final project (or fourth homework assignment) to be handed in during finals week. The approximate grading basis of the course will be as follows:
This is a "hands-on" course: most of the learning will be derived from doing the homework assignments.
There will be three regular homework assignments during the term--see the course outline for due dates. (The assignments will generally be due on Tuesdays.) The homework assignments will all be computer-based, involving the development of forecasting models for one or more data sets. You may collaborate with one other student on a homework assignment--i.e., the maximum size of a study group is two persons. (See study group policy above.)
Assignments will be submitted electronically through the appropriate links that will appear on the Decision 411 course page when the time comes. For each assignment you should submit the following three files: (i) a PowerPoint or Word file containing your presentation, (ii) your Statgraphics data file, and (iii) your Statgraphics statfolio file. (Please note that you must submit both your data file and your statfolio file in order for me to be able to audit your work.)
Numbers that are cited in your presentation are often best presented in the form of tables, particularly where features of different models are being compared against each other. In many cases, the tabular reports automatically produced by statistical or spreadsheet software--especially the "Analysis Summary" and "Model Comparison" reports from Statgraphics--are ideal for this purpose, but in some cases you may have to create your own tables. Additional guidelines for writeups will be handed out in class with the first assignment.
Grading of assignments will be based on the correctness of the analysis and on the organization, clarity and effectiveness of the presentation. You should think carefully about which charts and tables to present, and in what order, and how to format and annotate them for maximum informativeness. (However, you should avoid using complicated backgrounds, fancy fonts, drop shadows, clip art, and other forms of "chart junk" which merely distract attention away from the data.) Your goal should be to explain how you analyzed the data and what you learned from it in as clear and concise a manner as possible, and to leave just enough of an audit trail to explain how you arrived at your bottom-line numbers and conclusions.
Homework grading criteria:
Homework assignments will be graded on the following criteria
At the end of the course a final project will be due. This may be a data analysis project of your own devising--e.g., an analysis of a data set of your choice. Ideally, the aim of the final project will be to develop a forecasting model for a data set in which you are especially interested, although other kinds of statistical analysis are also admissible. The important thing is that the project should have significant data analysis content and should use at least some of the modeling concepts introduced in the course. A project may be designed in connection with work in another course, subject to the approval of both instructors. Alternatively, there will be several "designated project" options for those who don't want to design their own projects. The designated project will essentially consist of a fourth homework assignment, and will emphasize regression and ARIMA models.
You may collaborate with one other person--e.g., your homework partner--on a final project. (See study group policy above.)
If you design your own project, you are responsible for defining the objectives and collecting the data. The Economagic database and other library databases at Fuqua may be good sources of general economic and financial data, and as the term progresses we will also be looking for new data sources on the World Wide Web. The presentation is expected to be more substantial than one of the later homework assignments by virtue of the fact that it should include a description of the data and a summary of the background research or experience that motivated the analysis.
Students wishing to design their own projects are advised to begin researching and collecting data by the start of the 5th week of class and should submit written descriptions of their proposed projects to the instructor by the beginning of the last week of class. Keep in mind that the hardest part of doing a project--or any statistical data analysis--is obtaining the data in the first place. Try to find out as soon as you can whether the data that you need is going to be obtainable in time to finish a project by the end of exam week. Ideally, you should have your data in hand and partially analyzed by the time you submit your written project description.
Final projects are to be submitted electronically, following the same general guidelines as for homework assignments.
As an honor code matter, it is expected that students will not discuss details of the data analysis parts of assignments with any students other than their study-group partner (if any), except for tutors or RA's designated by the instructor. However, students are permitted and even encouraged to help each other understand general concepts of data analysis, properties of forecasting models, and features of software, as long as the discussion is unrelated to the details of the current assignment. For example, it's OK to ask a classmate how to interpret a particular statistic or type of graph in the context of a generic problem or a past assignment, but not as it applies to the data in the current assignment.
No course materials from previous years (HW papers, projects, exams, etc.) may be used other than those handed out or placed on reserve by the instructor.