Chapter 3

Data Frames

Is it true that this is a matrix?

  x y
1 1 A
2 2 B
3 3 C

No

because it is a collection of vectors of different types, and a matrix holds values of a single type.


Is it true that this is a data frame?

  x y
1 1 A
2 2 B
3 3 C

Yes

because it is a collection of vectors of different types.
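The two answers so far can be checked in an R session. The sketch below is only an illustration: the names m and df are not from the text, and stringsAsFactors = FALSE is passed so that the result is the same on versions of R older than 4.0.

```r
# Illustrative names (m, df); they do not appear in the questions.
x <- c(1, 2, 3)
y <- c("A", "B", "C")

# cbind() builds a matrix, which can hold only one type,
# so the numbers are coerced to character strings.
m <- cbind(x, y)
is.matrix(m)        # TRUE
typeof(m)           # "character"

# data.frame() keeps each column's own type.
df <- data.frame(x, y, stringsAsFactors = FALSE)
is.data.frame(df)   # TRUE
sapply(df, class)   # x is "numeric", y is "character"
```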


Is it true that this is a data frame?

1 1 A
2 2 B
3 3 C

No

because the vectors have no names.


Is it true that this is a data frame?

data.frame(x=c("1", "2", "3"), y=c("A", "B", "C"))

Yes

because it is a collection of named vectors.
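Note that the values in x are quoted, so they are strings rather than numbers. A minimal sketch of how to check that and convert the column, assuming the name df (not from the text) and passing stringsAsFactors = FALSE so older versions of R behave the same way:

```r
# df is an illustrative name.
df <- data.frame(x = c("1", "2", "3"), y = c("A", "B", "C"),
                 stringsAsFactors = FALSE)
names(df)                  # "x" "y"
class(df$x)                # "character": quoted digits are strings

# Convert the column when numbers are wanted.
df$x <- as.numeric(df$x)
class(df$x)                # "numeric"
```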


Is it true that this is a data frame?

data.frame(c("1", "2", "3"),c("A", "B", "C"))

Yes

because it is a collection of named vectors -- with really ugly names:

  c..1....2....3.. c..A....B....C..
1                1                A
2                2                B
3                3                C
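The generated names can be inspected and replaced with names(). This is a small sketch; the variable names ugly and generated are illustrative only.

```r
# ugly and generated are illustrative names.
ugly <- data.frame(c("1", "2", "3"), c("A", "B", "C"))
generated <- names(ugly)     # the names R built from the unnamed arguments
generated

# Replace them after the fact with something readable.
names(ugly) <- c("x", "y")
names(ugly)                  # "x" "y"
```

Naming the vectors in the call to data.frame, as in the previous question, avoids the problem in the first place.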

Is it true that this is a data frame?

beatles <- c("John", "Paul", "George", "Ringo")
destinyschild <- c("Beyonce", "Kelly", "Michelle")
data.frame(beatles, destinyschild)

No

because it is a collection of named vectors of unequal lengths.
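Running the call stops with an error. The sketch below uses tryCatch to capture that error so a script can continue; the name result is illustrative.

```r
beatles <- c("John", "Paul", "George", "Ringo")
destinyschild <- c("Beyonce", "Kelly", "Michelle")

# tryCatch() returns the error object instead of stopping the script.
result <- tryCatch(
  data.frame(beatles, destinyschild),
  error = function(e) e
)
inherits(result, "error")   # TRUE: no data frame was created
conditionMessage(result)    # the text of the error R reported
```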


Is it true that this is a data frame?

beatles <- c("John", "Paul", "George", "Ringo")
monkees <- c("Micky", "Michael", "Peter", "Davy")
data.frame(beatles, monkees)

Yes

because it is a collection of named vectors of equal length.


What is the value of bands$beatles where bands is

data.frame(beatles=c("John", "Paul", "George", "Ringo"), monkees=c("Micky", "Michael", "Peter", "Davy"))

[1] "John"   "Paul"   "George" "Ringo" 

because those are the values of the vector named beatles.
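A short sketch of two equivalent ways to pull out the column. stringsAsFactors = FALSE is written out so the column is a character vector on any version of R; it is already the default in R 4.0 and later.

```r
bands <- data.frame(
  beatles = c("John", "Paul", "George", "Ringo"),
  monkees = c("Micky", "Michael", "Peter", "Davy"),
  stringsAsFactors = FALSE   # already the default in R 4.0 and later
)
bands$beatles           # the whole column as a character vector
bands[["beatles"]]      # the same column, selected with [[ ]]
length(bands$beatles)   # 4
```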


What is the value of bands$Beatles where bands is

data.frame(beatles=c("John", "Paul", "George", "Ringo"), monkees=c("Micky", "Michael", "Peter", "Davy"))

NULL

because there is no vector named Beatles in bands.
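Because $ returns NULL silently, a misspelled or miscapitalized name is easy to miss. The sketch below shows a few ways to check for a column; the name res is illustrative.

```r
bands <- data.frame(
  beatles = c("John", "Paul", "George", "Ringo"),
  monkees = c("Micky", "Michael", "Peter", "Davy"),
  stringsAsFactors = FALSE
)
is.null(bands$Beatles)        # TRUE: column names are case-sensitive
"Beatles" %in% names(bands)   # FALSE
"beatles" %in% names(bands)   # TRUE

# Unlike $, matrix-style indexing signals an error for an unknown column.
res <- tryCatch(bands[, "Beatles"], error = function(e) e)
inherits(res, "error")        # TRUE
```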


Is it true that this is a data frame?

data.frame(data.frame(slayers=c("buffy", "faith", "kendra")), data.frame(scoobies=c("willow", "xander", "giles")))

Yes

because it is a combination of two data frames with the same number of rows.


Is it true that this is a data frame?

data.frame(data.frame(slayers=c("buffy", "faith", "kendra")), data.frame(scoobies=c("willow", "xander", "giles", "anya", "tara", "dawn")))

Yes

although it may not look like you expect it to:

  slayers scoobies
1   buffy   willow
2   faith   xander
3  kendra    giles
4   buffy     anya
5   faith     tara
6  kendra     dawn
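The repetition comes from recycling: the shorter input is repeated until it fills the rows of the longer one. A minimal sketch, where gang is an illustrative name that is not in the text:

```r
slayers <- data.frame(slayers = c("buffy", "faith", "kendra"))
scoobies <- data.frame(scoobies = c("willow", "xander", "giles",
                                    "anya", "tara", "dawn"))

# gang is an illustrative name.
gang <- data.frame(slayers, scoobies)
nrow(gang)      # 6: the longer input sets the number of rows
gang$slayers    # buffy, faith, kendra, then the same three again
```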

Is it true that this is a data frame?

data.frame(data.frame(slayers=c("buffy", "faith", "kendra")), data.frame(scoobies=c("willow", "xander", "giles", "anya", "tara")))

No

because the data frame containing slayers has 3 rows and the data frame containing scoobies has 5, and neither number of rows is evenly divisible by the other.


Is it true that this is a data frame?

data.frame(capitols=c("Trenton", "Annapolis", "Sacramento"), representatives=c(12,8,53), bird=c("cardinal", "oriole", "quail"), row.names=c("New Jersey", "Maryland", "California"))

Yes

because row names can be specified in data frames, though they do not have to be:

             capitols representatives     bird
New Jersey    Trenton              12 cardinal
Maryland    Annapolis               8   oriole
California Sacramento              53    quail
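Row names are labels on the rows, not an extra column. The sketch below compares a data frame with explicit row names to one without; plain is an illustrative name.

```r
states <- data.frame(
  capitols = c("Trenton", "Annapolis", "Sacramento"),
  representatives = c(12, 8, 53),
  bird = c("cardinal", "oriole", "quail"),
  row.names = c("New Jersey", "Maryland", "California")
)
rownames(states)   # "New Jersey" "Maryland" "California"
dim(states)        # 3 3: the row names are not counted as a column

# Without row.names, R numbers the rows automatically.
plain <- data.frame(bird = c("cardinal", "oriole", "quail"))
rownames(plain)    # "1" "2" "3"
```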

What is states[3,] where states is

data.frame(capitols=c("Trenton", "Annapolis", "Sacramento"), representatives=c(12,8,53), bird=c("cardinal", "oriole", "quail"), row.names=c("New Jersey", "Maryland", "California"))

             capitols representatives  bird
California Sacramento              53 quail

because California is the third row of states.


What is states["California",] where states is

data.frame(capitols=c("Trenton", "Annapolis", "Sacramento"), representatives=c(12,8,53), bird=c("cardinal", "oriole", "quail"), row.names=c("New Jersey", "Maryland", "California"))

             capitols representatives  bird
California Sacramento              53 quail

because California is the specified row in states.


What is states["California",2] where states is

data.frame(capitols=c("Trenton", "Annapolis", "Sacramento"), representatives=c(12,8,53), bird=c("cardinal", "oriole", "quail"), row.names=c("New Jersey", "Maryland", "California"))

[1] 53

because California is the specified row in states, representatives is the second column in states, and 53 is the value in row California, column representatives.
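Rows and columns can each be selected by name or by position, in any combination. A short sketch of equivalent ways to reach the same cell:

```r
states <- data.frame(
  capitols = c("Trenton", "Annapolis", "Sacramento"),
  representatives = c(12, 8, 53),
  bird = c("cardinal", "oriole", "quail"),
  row.names = c("New Jersey", "Maryland", "California")
)
states["California", 2]                   # row by name, column by position
states[3, "representatives"]              # row by position, column by name
states["California", "representatives"]   # both by name
states$representatives[3]                 # column first, then element
# All four expressions return 53.
```

Selecting by name is usually the safer choice, since it keeps working if rows or columns are later reordered.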


Is it true that this creates a data frame?

who <- data.frame(doctors=c(1, 2, 3), actors=c("William Hartnell", "Patrick Troughton", "Jon Pertwee"))
who <- rbind(who, data.frame(doctors=4, actors="Tom Baker"))

Yes

because rbind adds a row to a data frame.


What is who[5,2] when who is

who <- data.frame(doctors=c(1, 2, 3), actors=c("William Hartnell", "Patrick Troughton", "Jon Pertwee"))
who <- rbind(who, data.frame(doctors=4, actors="Tom Baker"), data.frame(doctors=5, actors="Peter Davison"))

[1] "Peter Davison"

because the fifth row of who is

5       5     Peter Davison

and Peter Davison is the value in the second column.
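The same cell can also be reached by column name, and the number of rows confirms that both new rows were added. A minimal sketch:

```r
who <- data.frame(doctors = c(1, 2, 3),
                  actors = c("William Hartnell", "Patrick Troughton",
                             "Jon Pertwee"))

# rbind() accepts any number of data frames with matching column names.
who <- rbind(who,
             data.frame(doctors = 4, actors = "Tom Baker"),
             data.frame(doctors = 5, actors = "Peter Davison"))
nrow(who)          # 5
who[5, "actors"]   # the same cell as who[5, 2], selected by column name
who[5, ]           # the whole fifth row, as a one-row data frame
```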


Is it true that this creates a data frame?

who <- data.frame(doctors=c(1, 2, 3), actors=c("William Hartnell", "Patrick Troughton", "Jon Pertwee"))
who <- rbind(who, data.frame(doctors=4, played_by="Tom Baker"))

No

because the column names of the new row (doctors, played_by) do not match the column names of who (doctors, actors).
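One possible way to recover, sketched below, is to rename the columns of the new row so that they match before binding. The names new_row and res are illustrative.

```r
who <- data.frame(doctors = c(1, 2, 3),
                  actors = c("William Hartnell", "Patrick Troughton",
                             "Jon Pertwee"))
# new_row and res are illustrative names.
new_row <- data.frame(doctors = 4, played_by = "Tom Baker")

# The mismatched names make rbind() signal an error.
res <- tryCatch(rbind(who, new_row), error = function(e) e)
inherits(res, "error")   # TRUE

# Rename the new row's columns to match, then bind.
names(new_row) <- names(who)
who <- rbind(who, new_row)
nrow(who)                # 4
```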


Is it true that this creates a data frame?

who <- data.frame(doctors=c(1, 2, 3), actors=c("William Hartnell", "Patrick Troughton", "Jon Pertwee"))
who <- cbind(who, start_year=c(1963, 1966, 1970))

Yes

because cbind adds a column to a data frame.


Is it true that this creates a data frame?

who <- data.frame(doctors=c(1, 2, 3), actors=c("William Hartnell", "Patrick Troughton", "Jon Pertwee"))
who <- cbind(who, start_year=c(1963, 1966, 1970, 1974))

No

because all columns in a data frame must have the same number of rows.
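The failed call leaves who untouched, and the same call succeeds once there is one value per row. A minimal sketch, with res as an illustrative name:

```r
who <- data.frame(doctors = c(1, 2, 3),
                  actors = c("William Hartnell", "Patrick Troughton",
                             "Jon Pertwee"))

# res is an illustrative name; it holds the captured error.
res <- tryCatch(
  cbind(who, start_year = c(1963, 1966, 1970, 1974)),
  error = function(e) e
)
inherits(res, "error")   # TRUE: 4 values cannot fill 3 rows
ncol(who)                # 2: who itself is unchanged

# With one value per row, the same call succeeds.
who <- cbind(who, start_year = c(1963, 1966, 1970))
ncol(who)                # 3
```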


Is it true that this creates a data frame?

who <- data.frame(doctors=c(1, 2, 3), actors=c("William Hartnell", "Patrick Troughton", "Jon Pertwee"), row.names=c("A", "B", "C"))
new_who <- data.frame(start_year=c(1963, 1966, 1970), row.names=c("I", "II", "III")) 
whole_who <- cbind(who, new_who)

Yes

because cbind pairs rows by position, not by row name; the result keeps the row names of the first data frame.
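A short sketch that inspects the combined data frame and shows which row names survive:

```r
who <- data.frame(doctors = c(1, 2, 3),
                  actors = c("William Hartnell", "Patrick Troughton",
                             "Jon Pertwee"),
                  row.names = c("A", "B", "C"))
new_who <- data.frame(start_year = c(1963, 1966, 1970),
                      row.names = c("I", "II", "III"))
whole_who <- cbind(who, new_who)

rownames(whole_who)            # "A" "B" "C": taken from who
names(whole_who)               # "doctors" "actors" "start_year"
whole_who["A", "start_year"]   # 1963: first row of who meets first row of new_who
```

Since the pairing is purely positional, it is worth confirming that both data frames list their rows in the same order before combining them this way.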