challenge: optimize unlisting [easy]

Because SO has been a bit slow lately, I'm posting an easy question. I would appreciate it if the big fish stayed on the bench for this one and gave rookies a chance to respond.

Sometimes we have objects with a ridiculous number of large list elements (vectors). How would you "unlist" such an object into a single vector? Show proof that your method is faster than unlist().

If you don't need names and your list is one level deep, then if you can beat

.Internal(unlist(your_list, FALSE, FALSE))

I will vote up everything you do on SO for the next 1 year!!!

[Update: if non-unique names are needed and the list is not nested, here is a version that is roughly 100 times faster than unlist() with its default arguments:

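 myunlist <- function(l){
    nms <- names(l)                                   # one name per list element
    vec <- unlist(l, FALSE, FALSE)                    # flatten without building names
    reps <- unlist(lapply(l, length), FALSE, FALSE)   # length of each element
    names(vec) <- rep(nms, reps)                      # repeat each name once per value
    vec
    }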

 myunlist(list(a=1:3, b=2))
 a a a b 
 1 2 3 2 

 > tl <- list(a = 1:20000, b = 1:5000, c = 2:30)
 > system.time(for(i in 1:200) unlist(tl))
 user  system elapsed 
 22.97    0.00   23.00 

 > system.time(for(i in 1:200) myunlist(tl))
 user  system elapsed 
 0.2     0.0     0.2 

 > system.time(for(i in 1:200) unlist(tl, F, F))
 user  system elapsed 
 0.02    0.00    0.02 

]
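For comparison, here is a sketch of the same idea using `lengths()` from base R, which avoids the `lapply()` plus `unlist()` round trip for the element lengths. The name `myunlist2` is hypothetical and not part of the original post. Like `myunlist`, it assumes a flat (non-nested) list of atomic vectors, and no claim is made here about how its speed compares.

```r
# Hypothetical variant of myunlist(); assumes a flat (non-nested) list.
myunlist2 <- function(l) {
  vec <- unlist(l, recursive = FALSE, use.names = FALSE)
  nms <- names(l)
  if (!is.null(nms)) {
    # Repeat each element's name once per value; names are NOT made unique.
    names(vec) <- rep(nms, lengths(l))
  }
  vec
}

x <- myunlist2(list(a = 1:3, b = 2))
names(x)
```

The `is.null()` guard only matters for unnamed lists, where `rep()` would otherwise be asked to repeat `NULL`.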

[Update2: Response to challenge no. 3 from Richie Cotton.

bigList3 <- replicate(500, rnorm(1e3), simplify = F)

unlist_vit <- function(l){
    names(l) <- NULL
    do.call(c, l)
    }

library(rbenchmark)

benchmark(unlist = unlist(bigList3, FALSE, FALSE),
          rjc    = unlist_rjc(bigList3),
          vit    = unlist_vit(bigList3),
          order  = "elapsed",
          replications = 100,
          columns = c("test", "relative", "elapsed")
          )

    test  relative elapsed
1 unlist   1.0000    2.06
3    vit   1.4369    2.96
2    rjc   3.5146    7.24

]

PS: I assume a "big fish" is the one with more reputation than you. So I am pretty much small here :).

A non-unlist() solution would have to be pretty darned fast to beat unlist(), would it not? Here it takes about two seconds to unlist a list of 2000 numeric vectors of length 100,000 each.

> bigList2 <- as.list(data.frame(matrix(rep(rnorm(1000000), times = 200), 
+                                       ncol = 2000)))
> print(object.size(bigList2), units = "Gb")
1.5 Gb
> system.time(foo <- unlist(bigList2, use.names = FALSE))
   user  system elapsed 
  1.897   0.000   2.019

With bigList2 and foo in my workspace, R is using ~9Gb of my available memory. The key is use.names = FALSE. Without it unlist() is painfully slow. Exactly how slow I'm still waiting to find out...

We can speed this up a little bit more by setting recursive = FALSE and then we have effectively the same as VitoshKa's answer (two representative timings):

> system.time(foo <- unlist(bigList2, recursive = FALSE, use.names = FALSE))
   user  system elapsed 
  1.379   0.001   1.416
> system.time(foo <- .Internal(unlist(bigList2, FALSE, FALSE)))
   user  system elapsed 
  1.335   0.000   1.344

... finally the use.names = TRUE version finished...:

> system.time(foo <- unlist(bigList2, use = TRUE))
    user   system  elapsed 
2307.839   10.978 2335.815

and it was consuming all of my system's 16Gb of RAM, so I gave up at that point...
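A small sketch of what the extra work looks like: with use.names = TRUE, unlist() builds a name for every single value (the element name plus an index), so a character vector as long as the result has to be created. That is presumably a large part of the cost seen above. The list `x` is made up for illustration.

```r
x <- list(a = 1:3, b = 4:5)

with_names    <- unlist(x)                     # builds one name per value
without_names <- unlist(x, use.names = FALSE)  # no names attribute at all

names(with_names)
identical(unname(with_names), without_names)
```

The values are the same either way; only the names attribute differs.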

c() has a logical argument, recursive, which makes it descend recursively through lists and combine all their elements into a single vector when set to TRUE (the default is FALSE).

l <- replicate(500, rnorm(1e3), simplify = F)

microbenchmark::microbenchmark(
  unlist = unlist(l, FALSE, FALSE),
  c = c(l, recursive = TRUE, use.names = FALSE)
)

# Unit: milliseconds
# expr      min       lq     mean   median       uq      max neval
# unlist 3.083424 3.121067 4.662491 3.172401 3.985668 27.35040   100
#      c 3.084890 3.133779 4.090520 3.201246 3.920646 33.22832   100
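As a quick check that the two calls agree on more than flat lists, here is a sketch with a small nested list. The object `nested` is made up for illustration, and the `use.names` argument of c() is assumed to be available, as in the benchmark above.

```r
nested <- list(1:2, list(3, list(4:5)))

flat_c      <- c(nested, recursive = TRUE, use.names = FALSE)
flat_unlist <- unlist(nested, recursive = TRUE, use.names = FALSE)

identical(flat_c, flat_unlist)
```

Note that the benchmark above calls unlist() with recursive = FALSE; for a list that is only one level deep, both settings produce the same vector.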

As a medium-size fish, I'm jumping in with a first-attempt solution that gives a benchmark for little fishes to beat. It's about 3 times slower than unlist.

I'm using a smaller version of ucfagls's test list. (Since it fits in memory better.)

bigList3 <- as.list(data.frame(matrix(rep(rnorm(1e5), times = 200), ncol = 2000)))

The basic idea is to create one long vector to store the answer, then loop over list items copying values from the list.

unlist_rjc <- function(l)
{
  lengths <- vapply(l, length, FUN.VALUE = numeric(1), USE.NAMES = FALSE)
  total_len <- sum(lengths)
  end_index <- cumsum(lengths)
  start_index <- 1 + c(0, end_index)
  v <- numeric(total_len)
  for(i in seq_along(l))
  {
    v[start_index[i]:end_index[i]] <- l[[i]]
  }
  v
}

t1 <- system.time(for(i in 1:10) unlist(bigList3, FALSE, FALSE))
t2 <- system.time(for(i in 1:10) unlist_rjc(bigList3))
t2["user.self"] / t1["user.self"]  # 3.08

Challenges for little fishes: 1. Can you extend it to deal with other types than numeric? 2. Can you get it to work with recursion (nested lists)? 3. Can you make it faster?

I'll upvote anyone with fewer points than me whose answer meets one or more of these mini-challenges.
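One possible direction for mini-challenge 1, offered only as a sketch: let c() work out the common type from zero-length slices of each element, then preallocate a vector of that type. The name `unlist_typed` is hypothetical. The sketch assumes a flat list of plain atomic vectors (no factors or other attributes), and it is not claimed to be faster than unlist().

```r
# Hypothetical extension of unlist_rjc(); assumes a flat list of atomic vectors.
unlist_typed <- function(l) {
  if (length(l) == 0L) return(NULL)            # mirrors unlist(list())
  lens <- as.numeric(lengths(l, use.names = FALSE))
  end_index <- cumsum(lens)
  start_index <- end_index - lens + 1
  # Combining zero-length slices gives an empty vector of the common type.
  proto <- do.call(c, lapply(unname(l), function(x) x[0]))
  v <- vector(typeof(proto), sum(lens))
  # Skip empty elements, since start:end would count backwards for them.
  for (i in which(lens > 0)) {
    v[start_index[i]:end_index[i]] <- l[[i]]
  }
  v
}

mixed <- list(1:3, c(2.5, 4), integer(0), 7L)
identical(unlist_typed(mixed), unlist(mixed, recursive = FALSE, use.names = FALSE))
```

Values are coerced on assignment into the preallocated vector, so integer elements end up as doubles when any element is a double, as with unlist().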

Comments
  • Everyone is a "big fish" here ;). You are running the risk of not getting any answer.
  • How big is big? Are we talking salmon, marlin or whale shark?
  • What's "a ridiculous amount of large list elements" mean - vectors of length 1,000,000 or longer? How many list elements is a "ridiculous" amount?
  • Names of the vectors must be preserved? If so must be unique? List is recursive? The default unlist does all of that.
  • Let's say tens of thousands of list elements, but I will let imagination run wild. In general, the number of elements should be large enough to show any difference in speed performance but within the memory limit. Names can be dumped. Let's assume list has
  • +1 It is probably not significant, but in my test your version was a bit faster than @ucfagls'. Yet the biggest speedup is gathered from use.names=F.
  • I understood from Roman's question that his desire was to replace the built-in "unlist" with something smarter. As far as I am concerned this is not possible when names are not required.
  • Now that's an offer a person can't refuse! If I write 6 items a day I may have a marginal chance of catching Dirk. FWIW, I consider big fishes people with several 1000 points.
  • On my system there was nothing to choose between unlist(bigList2, recursive = FALSE, use.names = FALSE) and .Internal(unlist(bigList2, FALSE, FALSE)). The overhead in unlist() before the .Internal call is negligible. Unless I'm really, really sure I have the correct object, I try to stay away from .Internal just in case, as you can crash R if you get something wrong or provide something the function wasn't expecting.
  • @ucfagls, that's right for big vectors and a small number of iterations. But if your list consists only of small vectors and you run long simulations, the improvement from .Internal could be as big as double!!!
  • Medium-size? According to the ranking of top users in the R tag, you are the 10th answerer in the last 30 days, and 7th all time ;)
  • Wow. Hadn't realised that. Yay me. Anyway, my offer still stands. Improve/beat my answer -> get an upvote.
  • on my machine your thing is about 5 times slower.
  • After clean restart it's only 3.5 times slower. I updated my answer for your challenge nr3. Waiting for upvote:)