This is an archive of a course I taught in Fall 2019, preserved here as a resource for future students.

ECE 566: Enterprise Storage Architecture

Section 01, Fall 2019


The invention of RAID probably ruined this guy's life.

Overview

Lecture location: Hudson 212
Lecture time: MW, 1:25PM - 2:40PM

Instructor: Dr. Tyler Bletsch
Email: Tyler.Bletsch AT duke.edu
Office Hours: Monday 3-4pm and Tuesday 1:30-2:30pm, Hudson 106 (or by appointment - feel free to email!)

Teaching Assistant: Bonan Yan (bonan.yan AT duke.edu)
TA Office Hours: By appointment

Schedule

| # | Date | Lecture | Assignment due (11:55:00pm) | Project due (11:55:00pm) |
| 1 | Mon 8/26 | Course introduction and policies | | |
| 2 | Wed 8/28 | Overview of storage systems and project discussion | | |
| 3 | Mon 9/2 | Hard disks, SSDs, and the I/O subsystem | | |
| 4 | Wed 9/4 | Hard disks, SSDs, and the I/O subsystem | (Fri 9/6) Lab 0 | |
| 5 | Mon 9/9 | Hardware failure in storage devices | | |
| 6 | Wed 9/11 | RAID | (Wed 9/11) Program 0 | |
| 7 | Mon 9/16 | Network-Attached Storage (NAS) | | |
| 8 | Wed 9/18 | Storage Area Network (SAN) | (Due Wed 9/18) Lab 1, Homework 1 | |
| 9 | Mon 9/23 | Overture (later topics summarized for project planning), Filesystems | | |
| 10 | Wed 9/25 | Filesystems | (Due Fri 9/27) Program 1 | |
| 11 | Mon 9/30 | Filesystems | | |
| 12 | Wed 10/2 | Storage efficiency | | (Due Wed 10/2) Project initial proposal, instructor meeting scheduled |
| | Mon 10/7 | Fall break | | (Thu 10/4 - Fri 10/11) Project proposal meetings |
| 13 | Wed 10/9 | Business continuity: High availability | (Due Thu 10/10) Lab 2, Homework 2 | |
| 14 | Mon 10/14 | Business continuity: High availability | | (Due Mon 10/14) Project final proposal |
| 15 | Wed 10/16 | Business continuity: Disaster recovery | | (Due Fri 10/18) Project status report 1 |
| 16 | Mon 10/21 | Business continuity: Disaster recovery | | |
| 17 | Wed 10/23 | Project workday (come prepared to work!) | | (Due Fri 10/25) Project status report 2 |
| 18 | Mon 10/28 | Virtual environments | | |
| 19 | Wed 10/30 | Midterm exam | | (Due Fri 11/1) Project status report 3 |
| 20 | Mon 11/4 | .~*CLOUD*~. | | |
| 21 | Wed 11/6 | Project workday (come prepared to work!) | (Due Wed 11/6) Lab 3, Homework 3 | (Due Fri 11/8) Project status report 4 |
| 22 | Mon 11/11 | Workload profiling and sizing | | |
| 23 | Wed 11/13 | Security | | (Due Fri 11/15) Project status report 5 |
| 24 | Mon 11/18 | Data forensics and recovery | | |
| | Tue 11/19 | NetApp field trip | | |
| 25 | Wed 11/20 | Next-gen storage technologies | | (Due Fri 11/22) Project materials and report, demo meeting scheduled |
| 26 | Mon 11/25 | Project final presentations (!) | ↓ Posted 11/4 (Due Mon 11/25) Lab 4, Homework 4 (mystery_app, disk-images.tgz) | (Mon 11/25 - Fri 12/6) Project demo meetings |
| | Sun 12/15 | Final exam: 7pm-10pm | | |

Field trip to NetApp

We'll be taking a field trip to NetApp. There, we'll hear about storage controller software development and about architecting complete customer environments, and we'll get a tour of their giant datacenter.
  • Date: Tuesday, 19 November 2019
  • Time: 5:30pm to 7:45pm
  • Carpool: Meets at 4:40pm in lobby outside Schiciano
  • Address: 6741 Louis Stephens Dr, Cary, NC 27519
  • Event Schedule:
    • 5:30-5:45pm: Arrival at GDL2
    • 5:45-6:30pm: GDL2 Tour
    • 6:30-6:45pm: Drive back to Building 1 + Catering Arrives
    • 6:45-7:45pm: Engineering Panelists and HR Overview
    • 7:45pm: Depart

Syllabus & policies

Course synopsis

A chance to study the design and deployment of massive storage systems of the sort used in large enterprises (banks, major IT departments, service providers, etc.). Includes coverage of hard disk and flash design, RAID, SAN and NAS topologies, filesystem design, data center architectures for high availability, data deduplication, business continuity, power-aware storage, and the economics of data storage with respect to cloud computing.

Assignments include hands-on lab work with physical servers, some pen-and-paper problems, and a semester-long programming project.

Pre-requisites for grad students: ECE 650 (Systems Programming and Engineering) or instructor consent.

Pre-requisites for undergrad students: Computer Science 310/ECE 353 (Operating Systems). You will also need basic networking knowledge (IP addressing, that network switches exist, layer 2 vs layer 3). This can be provided by ECE/COMPSCI 356 (Network Architecture), personal experience, or self-education in parallel with the course.

If you feel you have an OS and networking background but are missing the above pre-reqs, just contact me.

Grading breakdown

This course will require a semester-long project, homework assignments, and a final exam. Grading breakdown:

| Category | % |
| Project initial proposal | 2% |
| Project final proposal | 3% |
| Project status reports | 5% |
| Project final report | 10% |
| Project final presentation | 5% |
| Project final demo | 20% |
| Homeworks/programs/labs | 45% |
| Final exam | 10% |

Homework and Labs

There are two kinds of regular assignments, homeworks and labs:

In either case, you are free to discuss concepts covered in the class with others (other groups for lab work and other people for individual homework), but should not share answers or concrete steps outside the bounds of academic integrity.

Late homework/lab submissions incur penalties as follows:

NOTE: If you feel in advance that you may need an extension, contact the instructor. We can work with you if you see a scheduling problem coming, but extensions cannot be granted at or near the due date!

Your homework/lab grade will be based on what you submit to Sakai and when you submit it.

Student servers

To support experimentation on real hardware, several storage servers have been procured. Students will split into groups of ~3 and each will be assigned a server. Homeworks will guide students through physically examining, racking, installing, configuring, and using the servers in realistic scenarios.

Some of the hardware is a little dated, but it will exhibit all the usual performance trends, and it has the drives to experiment with RAID topologies, hybrid HDD+SSD storage, filesystem performance, and more. Budget does exist for upgrades if needed (e.g., adding a modern HDD for a comparative performance study).

The servers will start out in Hudson 06, but after setting them up, students will install these servers into a standard four-post rack. Rack space has been set aside for this in the "FitzWest" data center in the basement of CIEMAS, and students will be granted badge access to that space.

NOTE: FitzWest is a real production data center for Duke. Students must exercise caution when working in this space, taking care not to disturb operations of other systems.

To guide students through the early phases of server configuration, a few out-of-class lab sessions will be scheduled at a mutually agreeable time. Once servers are configured and deployed properly, all subsequent operations should be possible over the internet. However, if a physical malfunction occurs (such as drive failure or accidentally trashing the installed OS), students may need to do in-place maintenance of their server within its FitzWest rack.

Any major hardware failures should be reported to the course instructor.

Grade appeals

All regrade requests must be in writing. Email the TA with your questions. After speaking with the TA, if you still have concerns, contact the instructor.

All regrade requests must be submitted to the instructor no later than 1 week after the assignment was returned to you.

Academic integrity

I take academic integrity extremely seriously. Academic misconduct will not be tolerated, and all suspected violations of the Duke Honor Code will be referred to the Office of Student Conduct (for undergraduates) or the departmental Director of Graduate Studies (for graduate students). A student found responsible for academic dishonesty faces formal disciplinary action, which may include suspension. A student twice suspended automatically faces a minimum 5-year separation from Duke University.

In addition to the measures taken by the university, the affected assignment(s) will receive zero credit, or possibly -100% in egregious cases.

If you are considering academic dishonesty, please see me instead, and we can work something out! I want every student in my course to be successful.