Importing and Exporting Data

My typical advice is that, if you're doing your analysis in Matlab, you can focus on importing data and forget about exporting it. But in reality, it's helpful to know how to do both, and the commands are quite similar.

Unfortunately, Matlab is not nearly as versatile when it comes to data import/export as a language like R, which was designed for statisticians who deal with data in lots of forms. In this lesson, we'll focus on a simple example, and I'll point you to more complicated functions at the end.

Contents

Pick Your Poison

This is going to be a complicated discussion for a few reasons:

Sorry.

Thus, in general, you will need to do one of the following:

As a result, I'm going to list several options below. You'll have to play around to figure out which is best for you.

Getting Started

First you'll need to download the data (xls xlsx csv). These are data extracted from the Multnomah County Jury Project, which tabulated jury decisions in Multnomah County, Oregon, in the period 1973-1976. The data are currently stored in a comma-separated value file (csv), which is a pretty typical and portable data format. (Excel will read and save files in this format, and it's actually human-readable, since each line is just the entries in each row, separated by a delimeter: commas, spaces, tabs, etc.)

Now move to the directory where the file is located. Those of you with a spreadsheet program installed may want to take a look at the data.

Pulling data

In its simplest form, we want to use one of the following commands to pull data

[num,txt,raw] = xlsread('jury_data.xlsx');
num = csvread('jury_data.csv',1,0);

Comments:

If we used the xlsread option, we can do some data cleaning (advanced):

dat=nan(size(raw)); %make an empty cell array
strtype = cellfun(@ischar,raw); %apply the ischar function to every element of raw
dat(~strtype) = cell2mat(raw(~strtype)); %copy over cells that are already numbers
dat(strtype) = str2double(raw(strtype)); %convert non-numeric cells to numbers, then copy

Why do we solve the problem like this? Or better yet: what should our thought process be when we approach such a problem? Some thoughts:

Two things to take away from this:

  1. There is a lot of Matlab left to learn, and unfortunately, it's sometimes needed for very basic things. But there aren't that many new concepts, just more functions to learn.
  2. Matlab can be really powerful. It only took four lines to solve a data cleaning problem that would be absurd to do by hand-editing.

Writing Data to Spreadsheet

Thankfully, writing data out is basically the reverse process of reading data in. For more information, see the help files listed below. The process should be self-explanatory, with the caveat that you need to be aware of the limits of what type of data can be written out using each command.

Loading and Saving Matlab Data

This one's easy. To load data simply use the load command:

load practice.mat   % you have to be in the directory containing practice.mat
load('practice.mat')  %also works

Now, if you're inside a function, load won't put variables directly into the workspace, since it might accidentally overwrite something existing. In that case (or anytime), you can get an output from load

myvars = load('practice.mat');  %returns a structure; each field is a variable

As for saving data, you do that with save, which takes as its argument an output file name and list of variables:

a=2;
b=1;
save myoutput.mat a b  %save variables a and b to myoutput.mat
save('myoutput.mat','a','b')  %does the same thing

Further Reading

xlsread

xlswrite

csvread

csvwrite

textscan