Brackets, Braces, Comma-Separated Lists and All That

Recap

So far, we have three pretty complicated data types to deal with: matrices, cell arrays, and structures. We've also managed to generate a lot of confusion about which brackets and braces to use and when. This lesson is an attempt to straighten all that out, and to serve as a companion to the Data Slicing Cookbook.

Let's start with a simple example: We have an array of structs, called S, each of which has fields name (a string), number (a number), tags (a cell array), and ss, a structure that itself has a single numerical field, ff. We want to collect all the names, numbers, and tags lists in separate variables.

So let's set up the array:

S(1).name='Joe';
S(1).number=5;
S(1).tags={'male','control'};
S(1).ss.ff=4;

S(2).name='Sue';
S(2).number=1;
S(2).tags={'female','treatment'};
S(2).ss.ff=3;

S(3).name='Jean';
S(3).number=24;
S(3).tags={'dnf'};
S(3).ss.ff=8;

This array has three elements, each of which is a structure. Each structure has the same fields (as they must, or we can't concatenate them into an array).
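The same kind of array can also be built in a single call to struct. This is only a sketch, not how the lesson builds S: the variable name T is introduced here purely for illustration, so that S is left untouched. When struct is given cell arrays as values, it creates one array element per cell, so a field that should itself hold a cell array (tags) has to be wrapped in an outer cell.

```matlab
% Build a 1x3 struct array in one call (T is a hypothetical name).
% Each cell array of values supplies one value per element of T.
T = struct( ...
    'name',   {'Joe', 'Sue', 'Jean'}, ...
    'number', {5, 1, 24}, ...
    'tags',   {{'male','control'}, {'female','treatment'}, {'dnf'}}, ...
    'ss',     {struct('ff', 4), struct('ff', 3), struct('ff', 8)});

size(T)        % 1 3
T(2).name      % Sue
T(3).tags      % the 1x1 cell array {'dnf'}
T(1).ss.ff     % 4
```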

But how in the world do we get the data out? We know that typing

S.name
ans =

Joe


ans =

Sue


ans =

Jean

returns all the names, but how do we put this in a vector? To find the answer, we'll have to back up and introduce our next big Matlab concept: the comma-separated list.

Comma-Separated Lists

In its simplest form, a comma-separated list is just a series of variables (or values), separated by commas:

a=1;
b=2;
c=5;
a, b, c
a =

     1


b =

     2


c =

     5

Look familiar? When we type the names of the three variables, separated by commas, Matlab just views this as three commands entered on the same line. But this is the same output displayed above for the structs. In other words, we have a clue as to what type of data Matlab is returning when we type S.name.

Our second piece of evidence comes from functions. When we call a function like size with more than one input,

size(S,1) %how many rows does S have?
ans =

     1

we are passing the function a list of variables separated by commas — a comma-separated list!

Finally, we've also seen an example of a comma-separated list produced in conjunction with cell arrays. If we have a cell array C

C={'first','second','third'};

then

C{:}
ans =

first


ans =

second


ans =

third

produces output we now know to be the equivalent of

C{1},C{2},C{3}
ans =

first


ans =

second


ans =

third

that is, a comma-separated list!

But the most primitive example of a comma-separated list we didn't even recognize as one at the time. We used it to define a vector:

v = [a, b, c, 5, 9]
v =

     1     2     5     5     9

This command tells Matlab to take the variables and literals and concatenate them all into a vector, then assign this vector to a variable called v.

But we can also write this as follows:

cat(2,a,b,c,5,9) %the first number tells us to concatenate along dimension 2 (columns)
ans =

     1     2     5     5     9

Bit by bit, we're working our way down to how Matlab actually translates its shorthand syntax (brackets, braces, etc.) into function calls that do all the dirty work.

Here, we see that surrounding an expression with brackets simply calls the concatenate function on the value returned by that expression. In the examples above, the comma-separated list returns the sequence of values to be concatenated. We might say, in this case, that the comma-separated list has been captured by the brackets to form a vector that we can save as a single variable.
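The equivalence described above can be checked directly. The sketch below compares the bracket form with explicit calls to horzcat and cat, and then feeds the same values in through a cell array expanded with {:}. The names v1 through v5 and args are introduced here for illustration only.

```matlab
a = 1; b = 2; c = 5;

v1 = [a, b, c, 5, 9];          % bracket shorthand
v2 = horzcat(a, b, c, 5, 9);   % horizontal concatenation as a function call
v3 = cat(2, a, b, c, 5, 9);    % concatenation along dimension 2

% A cell array expanded with {:} yields a comma-separated list,
% so it can stand in for the values typed out by hand.
args = {a, b, c, 5, 9};
v4 = [args{:}];                % list captured by brackets
v5 = cat(2, args{:});          % list passed as function inputs

allSame = isequal(v1, v2, v3, v4, v5)   % true
```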

Capturing Comma-Separated Lists

Let's return to the problem we set ourselves at the start: getting data out of an array of structures.

Now we know that when we type

S.number
ans =

     5


ans =

     1


ans =

    24

what is being returned to us is a comma-separated list, the same as if we'd typed

S(1).number, S(2).number, S(3).number
ans =

     5


ans =

     1


ans =

    24

So how do we collect this into a variable? We simply capture the comma-separated return values in brackets, concatenating them into a vector:

n = [S.number]
n =

     5     1    24

The concept is a little abstruse, but very powerful. With this simple expression, we are asking Matlab to generate a comma-separated list of the values stored in the number field of each structure in S (S.number), concatenate these into a vector (the brackets), and store the result in a variable called n (n =).
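One caveat follows from the fact that the brackets are plain concatenation: an empty field contributes nothing, so the captured vector can end up shorter than the struct array. The sketch below uses a hypothetical array R (not part of the lesson's data) to show the effect, along with one possible way to detect it.

```matlab
% Hypothetical struct array with one empty entry
R(1).number = 5;
R(2).number = [];
R(3).number = 24;

n = [R.number]     % 5 24: the empty value vanishes in concatenation

% The result no longer lines up with R element by element
lengthsMatch = (numel(n) == numel(R))   % false

% One possible guard: flag the elements whose field is empty
missing = arrayfun(@(r) isempty(r.number), R)   % 0 1 0
```

When every element holds exactly one number, as in S, the captured vector has one entry per structure and this issue does not arise.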

The syntax can be a pain to understand, but once you've internalized the concept, you'll be able to write very powerful, very efficient code that manipulates data into more usable forms.

But we're not done. There are additional twists on this story, some of which we've already seen but have yet to explain.

Advanced Capture

Let's start with the next most complicated part of our initial dilemma: capturing the structure field ss. Because what we want to do is the equivalent of writing

[S(1).ss S(2).ss S(3).ss] %put all the structure fields in their own array
ans = 

1x3 struct array with fields:
    ff

we can simply reuse the capture strategy from above:

s = [S.ss]
s = 

1x3 struct array with fields:
    ff

Again, this works on two very simple principles:

  1. when we don't index a structure array, the dot notation gets us a comma-separated list of field values, one for each structure
  2. this comma-separated list can be captured (concatenated) by surrounding it with brackets
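Applying the two principles twice in a row reaches the nested field ff. The following is a minimal sketch that rebuilds only the ss field of S. The names ff and ff2 are introduced here for illustration, and the arrayfun line is simply an alternative, not something the lesson itself uses.

```matlab
% Rebuild just the nested field from the lesson's example
S(1).ss.ff = 4;
S(2).ss.ff = 3;
S(3).ss.ff = 8;

s  = [S.ss];    % step 1: capture the inner structures as a 1x3 struct array
ff = [s.ff]     % step 2: capture their ff values -> 4 3 8

% Alternative: visit each element of S and pull out ss.ff directly
ff2 = arrayfun(@(x) x.ss.ff, S)   % 4 3 8

% Note: writing [S.ss.ff] in one step is expected to raise an error,
% because S.ss is a list of three structures rather than a single one.
```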

But what about the names? Typing

nm = [S.name]
nm =

JoeSueJean

certainly concatenates the names, but not in the way we want. As we learned early on, concatenating strings with normal vector operators (brackets) leads to a single concatenated string.

But there is (inevitably) a solution. When we need to store groups of strings together, we use cell arrays, which don't require all the strings to be the same length.

For instance, we would write:

{'Joe','Sue','Jean'}
ans = 

    'Joe'    'Sue'    'Jean'

But here again, we see that concatenating into cell arrays works just like vector concatenation except instead of brackets, we use braces!

So why not try the same capture strategy as above, except, instead of trying to capture into a vector with brackets, we capture into a cell array using braces:

nm = {S.name}
class(nm)
nm = 

    'Joe'    'Sue'    'Jean'


ans =

cell

We see that nm is of class cell (that is, a cell array), just as we wanted, and contains all the names as its elements.
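Once the names are in a cell array, they can be used to look things up in S. The sketch below is one way to do this, using strcmp and cellfun on the captured names. The variable names isSue, sueNumber, and longNames are introduced here for illustration.

```matlab
% Rebuild the relevant fields of the lesson's example
S(1).name = 'Joe';  S(1).number = 5;
S(2).name = 'Sue';  S(2).number = 1;
S(3).name = 'Jean'; S(3).number = 24;

nm = {S.name};                 % capture the names in a cell array

% strcmp compares every cell against the string at once
isSue = strcmp(nm, 'Sue')      % 0 1 0

% The logical mask then indexes back into the struct array
sueNumber = S(isSue).number    % 1

% cellfun applies a function to each cell; here, keep names longer than 3 characters
longNames = nm(cellfun(@numel, nm) > 3)   % the 1x1 cell array {'Jean'}
```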

By now, then, the solution to the final part of our problem should be clear. If we want to store all the tags separately, we will need to do so in a cell array, since each group of tags is itself a cell array, and we want to keep the tags corresponding to different entries of S separated. That is, we want

{S(1).tags,S(2).tags,S(3).tags}
ans = 

    {1x2 cell}    {1x2 cell}    {1x1 cell}

which is very easy to get by entering

t = {S.tags}
t{1}
t{2}
t{3}
t = 

    {1x2 cell}    {1x2 cell}    {1x1 cell}


ans = 

    'male'    'control'


ans = 

    'female'    'treatment'


ans = 

    'dnf'
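The choice between braces and brackets matters here. Braces keep each entry's tags in their own cell, while brackets concatenate the three cell arrays into one flat list. The following sketch contrasts the two. The names allTags, flat, and hasControl are introduced here for illustration.

```matlab
% Rebuild the tags field from the lesson's example
S(1).tags = {'male', 'control'};
S(2).tags = {'female', 'treatment'};
S(3).tags = {'dnf'};

t = {S.tags};         % 1x3 cell array; each cell holds one entry's tags

allTags = [S.tags]    % 1x5 cell array: every tag, with grouping lost
flat    = [t{:}];     % same result, starting from the nested version

% With the grouping kept, we can ask which entries carry a given tag
hasControl = cellfun(@(tg) any(strcmp(tg, 'control')), t)   % 1 0 0
```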

Summary