Back

Explore Courses Blog Tutorials Interview Questions
0 votes
2 views
in Data Science by (17.6k points)

I have a R dataset with key value strings which looks like below:

quest<-data.frame(city=c("Atlanta","New York","Atlanta","Tampa"), key_value=c("rev=63;qty=1;zip=45987","rev=10.60|34;qty=1|2;zip=12686|12694","rev=12;qty=1;zip=74268","rev=3|24|8;qty=1|6|3;zip=33684|36842|30254"))

which translates to:

      city                                  key_value

1  Atlanta                     rev=63;qty=1;zip=45987

2 New York       rev=10.60|34;qty=1|2;zip=12686|12694

3  Atlanta                     rev=12;qty=1;zip=74268

4    Tampa rev=3|24|8;qty=1|6|3;zip=33684|36842|30254

Based on the above dataframe how can I create a new data frame which looks like below :

      city  rev qty   zip

1  Atlanta 63.0   1 45987

2 New York 10.6   1 12686

3 New York 34.0   2 12686

4  Atlanta 12.0   1 74268

5    Tampa  3.0   1 33684

6    Tampa 24.0   6 33684

7    Tampa  8.0   3 33684

"|" is the common delimiter which will determine the number of rows to be created.

1 Answer

0 votes
by (41.4k points)

Follow the steps for below code:

1.Firstly, split by then by = and |, and after that by using the first part as the name, combine into a matrix. 

2.Then repeat the rows of the original data frame and then combine. 

Note that the columns are left as character.

 

a <- strsplit(as.character(quest$key_value), ";")

a <- lapply(a, function(x) {

    x <- do.call(cbind, strsplit(x, "[=|]"))

    colnames(x) <- x[1,]

    x[-1,,drop=FALSE]

})

b <- quest[rep(seq_along(a), sapply(a, nrow)), colnames(quest) != "key_value", drop=FALSE]

out <- cbind(b, do.call(rbind, a), stringsAsFactors=FALSE)

rownames(out) <- NULL

output:

##       city   rev qty   zip

## 1  Atlanta    63   1 45987

## 2 New York 10.60   1 12686

## 3 New York    34   2 12694

## 4  Atlanta    12   1 74268

## 5    Tampa     3   1 33684

## 6    Tampa    24   6 36842

## 7    Tampa     8   3 30254

Related questions

Browse Categories

...