Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

period.apply on xts object takes about 50x longer than running it on it's core data and recasting. #278

Open
ckatsulis opened this issue Nov 2, 2018 · 10 comments
Labels
enhancement Enhancement to existing feature help wanted

Comments

@ckatsulis
Copy link

ckatsulis commented Nov 2, 2018

Description

on large xts objects +1MM lines, period.apply is much faster using core data and recasting

Expected behavior

about equal runtimes with and without a numeric index

Minimal, reproducible example

n <- 1e6
ep <- seq(1, n, 3) 
x <- .xts(runif(n, 1, n), 1:n)
system.time(test1 <- period.apply(x, ep, sum)) 
#    user  system elapsed
#  12.637   0.010  12.653
system.time(test2 <- period.apply(coredata(x), ep, sum)) 
#    user  system elapsed
#   0.803   0.000   0.804

Session Info

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rcpp_0.12.19               roll_1.0.7                 forecast_8.4               DEoptim_2.2-4              dygraphs_1.1.1.6           PerformanceAnalytics_1.5.2 FinancialInstrument_1.3.1  quantmod_0.4-13            TTR_0.23-3                
[10] xts_0.11-1                 zoo_1.8-3                 

loaded via a namespace (and not attached):
 [1] uroot_2.0-9          lattice_0.20-35      assertthat_0.2.0     digest_0.6.15        lmtest_0.9-36        mime_0.5             R6_2.2.2             plyr_1.8.4           backports_1.1.2      futile.options_1.0.1 ggplot2_3.0.0       
[12] pillar_1.2.3         rlang_0.3.0          uuid_0.1-2           lazyeval_0.2.1       curl_3.2             misc3d_0.8-4         data.table_1.11.4    fracdiff_1.4-2       R.utils_2.6.0        R.oo_1.22.0          checkmate_1.8.5     
[23] stringr_1.3.1        pander_0.6.2         htmlwidgets_0.9      RCurl_1.95-4.10      munsell_0.5.0        shiny_1.0.5          compiler_3.5.1       httpuv_1.4.4.2       pkgconfig_2.0.1      qmao_1.6.11          urca_1.3-0          
[34] htmltools_0.3.6      nnet_7.3-12          tidyselect_0.2.5     htmlTable_1.9        tibble_1.4.2         quadprog_1.5-5       dplyr_0.7.7          later_0.7.3          bitops_1.0-6         R.methodsS3_1.7.1    grid_3.5.1          
[45] nlme_3.1-137         jsonlite_1.5         xtable_1.8-2         gtable_0.2.0         formatR_1.5          magrittr_1.5         scales_0.5.0         RcppParallel_4.4.1   stringi_1.2.3        promises_1.0.1       tseries_0.10-45     
[56] bindrcpp_0.2.2       timeDate_3043.102    futile.logger_1.4.3  plot3D_1.1.1         lambda.r_1.2.3       tools_3.5.1          RJSONIO_1.3-0        glue_1.3.0           purrr_0.2.5          yaml_2.1.19          colorspace_1.3-2    
[67] knitr_1.20           bindr_0.1.1          profvis_0.3.5       
@joshuaulrich
Copy link
Owner

joshuaulrich commented Nov 3, 2018

period.apply() is essentially a convenience wrapper around sapply(), so it's not particularly performant. There's a faster unexported C-based version that needs some testing before I'm comfortable exporting it and/or replacing period.apply() with it (see 7c29359).

n <- 1e6L
ep <- c(0L, seq(1L, n, 3L))
x <- .xts(runif(n, 1, n), 1:n)
system.time(test1 <- xts:::period_apply(x, ep, sum))
#    user  system elapsed
#   2.104   0.044   2.149
system.time(test2 <- period.apply(coredata(x), ep, sum))
#    user  system elapsed
#   0.705   0.019   0.725
identical(coredata(test1), test2)
# [1] TRUE

I may be able to use some ideas from simplify2array() to make it even faster. But testing... it needs much testing.

@ckatsulis
Copy link
Author

ckatsulis commented Nov 3, 2018 via email

@joshuaulrich joshuaulrich added enhancement Enhancement to existing feature help wanted labels Nov 3, 2018
@harvey131
Copy link
Contributor

I've been able to go faster using data.table and converting the results back to an xts:

library(xts)  
n <- 1e6L  
ep <- c(0L, seq(1L, n, 3L))  

x <- .xts(runif(n, 1, n), 1:n)
system.time(test1 <- xts:::period_apply(x, ep, sum))
   user  system elapsed 
   1.25    0.00    1.25 
system.time(test2 <- period.apply(coredata(x), ep, sum))
   user  system elapsed 
   0.47    0.00    0.46 
identical(coredata(test1), test2)
[1] TRUE

period.sumDT <- function(x, INDEX)
{
  idx <- findInterval(index(x), index(x)[INDEX], left.open = T)
  dt <- data.table::data.table(idx=idx, data=coredata(x)[,1])
  dt <- as.matrix(dt[, sum(data, na.rm=T), by=idx])
  idx <- dt[,1] < length(INDEX)
  setNames( xts(dt[idx,2], index(x)[INDEX][dt[idx,1]+1]), colnames(x))
}
system.time(test3 <- period.sumDT(x, ep ))
   user  system elapsed 
   0.07    0.00    0.08 
all.equal(test1, test3)
[1] TRUE
>   period_apply <- xts:::period_apply  # not exported
>   set.seed(21)
>   x <- .xts(rnorm(1e7), 1:1e7)
>   e <- endpoints(x, "seconds", 5)
>   system.time(y <- period.apply(x, e, sum))  # current version
   user  system elapsed 
  49.17    0.03   49.20 
>   system.time(z <- period_apply(x, e, sum))  # new C version
   user  system elapsed 
   8.14    0.09    8.33 
>   all.equal(y, z)
[1] TRUE

period.sumDT <- function(x, INDEX)
{
  idx <- findInterval(index(x), index(x)[INDEX], left.open = T)
  dt <- data.table::data.table(idx=idx, data=coredata(x)[,1])
  dt <- as.matrix(dt[, sum(data, na.rm=T), by=idx])
  idx <- dt[,1] < length(INDEX)
  setNames( xts(dt[idx,2], index(x)[INDEX][dt[idx,1]+1]), colnames(x))
}
>   system.time(test3 <- period.sumDT(x, e ))
   user  system elapsed 
   0.31    0.11    0.42 
>   all.equal(y, test3)
[1] TRUE

@joshuaulrich
Copy link
Owner

joshuaulrich commented Dec 15, 2018

I appreciate your comment and desire to make xts even faster. However, your code is over-optimized to this specific example.

Here are the issues that are immediately clear to me:

  • period.apply() needs to accept any function, not only sum()
  • The user's function may also return an object with more than one column
  • The function may also assume that each subset is an xts object

I'm also very unlikely to change the existing period.apply() function until we have adequate unit test coverage.


I also just checked my suspicion that data.table optimizes the call to sum() by avoiding any R evaluations and instead doing all the summing at the C level. So the performance improvement your example shows would not generalize to most other functions.

@harvey131
Copy link
Contributor

harvey131 commented Dec 17, 2018

I see show improvement for other functions as well. It seems to be faster in any instance where FUN operates on an argument x that is the coredata of each interval. It could be useful as an alternative implementation of period.sum, period.min/max. I also included an example comparing to period.sum, which does not allow x to have more than one column.

This example would not work if the user needed the x argument to FUN to be an xts object, with an index attribute. It is not documented in period.apply if FUN can assume it gets an xts object, or even that x is an xts. You have an example using letters where x is not an xts object .

> library(xts)
> n <- 1e6L
> x <- .xts(runif(n, 1, n), 1:n)
> e <- endpoints(x, "seconds", 5)
> # not exported
> period_apply <- xts:::period_apply
> # period.apply using datatable
> period.apply.DT <- function(x, INDEX, FUN, ...)
+ {
+   dt <- data.table::as.data.table(x)
+   dt$period <- as.numeric(dt$index[INDEX][findInterval(dt$index, dt$index[INDEX], left.open = T)+1])
+   dt <- as.matrix(dt[, lapply(.SD, FUN, ...), by='period', .SDcols = !'index'])
+   setNames(xts(dt[,-1], structure(dt[,1], class = c("POSIXct", "POSIXt"), tz='')), colnames(x))
+ }
> # period.sum using datatable
> period.sum.DT <- function(x, INDEX, na.rm=F)
+ {
+   dt <- data.table::as.data.table(x)
+   dt$period <- as.numeric(dt$index[INDEX][findInterval(dt$index, dt$index[INDEX], left.open = T)+1])
+   dt <- as.matrix(dt[, lapply(.SD, sum, na.rm=na.rm), by='period', .SDcols = !'index'])
+   setNames(xts(dt[,-1], structure(dt[,1], class = c("POSIXct", "POSIXt"), tz='')), colnames(x))
+ }
> #
> my_check <- function(values) { all(sapply(values[-1], function(x) all.equal(values[[1]], x))) }
> # period.apply with FUN=sum
> microbenchmark::microbenchmark(period.apply(x, e, sum),
+                                period_apply(x, e, sum),
+                                period.apply.DT(x, e, sum),
+                                times=3, check=my_check, unit = 's')
Unit: seconds
                       expr       min        lq      mean    median        uq       max neval
    period.apply(x, e, sum) 4.2666698 4.2768328 4.2851441 4.2869959 4.2943812 4.3017665     3
    period_apply(x, e, sum) 0.7217397 0.7342956 0.7661842 0.7468515 0.7884065 0.8299614     3
 period.apply.DT(x, e, sum) 0.1849304 0.1883792 0.1928879 0.1918279 0.1968667 0.2019055     3
> # period.sum
> microbenchmark::microbenchmark(period.sum(x, e),
+                                period.sum.DT(x, e),
+                                times=3, check=my_check, unit = 's')
Unit: seconds
                expr       min        lq      mean   median        uq       max neval
    period.sum(x, e) 4.2866287 4.4034498 4.4782497 4.520271 4.5740602 4.6278495     3
 period.sum.DT(x, e) 0.1248247 0.1333578 0.1644498 0.141891 0.1842624 0.2266339     3
> # period.apply with FUN=min
> microbenchmark::microbenchmark(period.apply(x, e, min),
+                                period_apply(x, e, min),
+                                period.apply.DT(x, e, min),
+                                times=3, check=my_check, unit = 's')
Unit: seconds
                       expr       min        lq      mean    median        uq       max neval
    period.apply(x, e, min) 4.2764956 4.3273872 4.3512772 4.3782788 4.3886679 4.3990571     3
    period_apply(x, e, min) 0.5536805 0.5877314 0.6010733 0.6217822 0.6247697 0.6277571     3
 period.apply.DT(x, e, min) 0.1632318 0.1833229 0.1949875 0.2034139 0.2108654 0.2183168     3
> # period.apply with arbitrary function
> microbenchmark::microbenchmark(period.apply(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+                                period_apply(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+                                period.apply.DT(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+                                times=3, check=my_check, unit = 's')
Unit: seconds
                                                              expr       min        lq      mean    median        uq       max neval
    period.apply(x, e, function(x) {     sqrt(prod(x)) + min(x) }) 4.5865064 4.6597227 4.7162952 4.7329391 4.7811896 4.8294402     3
    period_apply(x, e, function(x) {     sqrt(prod(x)) + min(x) }) 1.1638887 1.1998162 1.2164511 1.2357438 1.2427324 1.2497210     3
 period.apply.DT(x, e, function(x) {     sqrt(prod(x)) + min(x) }) 0.2706527 0.2719362 0.2981492 0.2732197 0.3118975 0.3505753     3

@joshuaulrich
Copy link
Owner

joshuaulrich commented Dec 19, 2018

Thanks for the notes about period.sum() and others. That slowdown was due to an unnecessary as.matrix() call. The timings are more in xts' favor once those are removed:

> microbenchmark::microbenchmark(period.sum(x, e),
+                                period.sum.DT(x, e),
+                                times=3, check=my_check)
Unit: milliseconds
                expr      min       lq     mean   median       uq      max neval
    period.sum(x, e) 11.41976 11.47986 17.24279 11.53996 20.15431 28.76866     3
 period.sum.DT(x, e) 59.08953 61.27357 64.23368 63.45761 66.80575 70.15389     3

This example would not work if the user needed the x argument to FUN to be an xts object, with an index attribute. It is not documented in period.apply if FUN can assume it gets an xts object, or even that x is an xts. You have an example using letters where x is not an xts object.

You're correct that it isn't documented that FUN may expect x to be an xts object. But period.apply() calls try.xts() which attempts to convert x to xts, if possible. Practically, it is very likely that FUN expects an object similar to the object passed to period.apply(). Unfortunately, in your faster function that is only the case when the object passed to period.apply() is a data.table.


I'm very surprised at the data.table performance for the arbitrary function. It is faster than simply calling the function 200,000 times (the length of endpoints). I don't feel good about that.

d <- x[1:4]
f <- function(x) { sqrt(prod(x)) + min(x) }
system.time(for(i in seq_len(length(e))) f(d))
#    user  system elapsed 
#   0.576   0.000   0.577 
system.time(period.apply.DT(x, e, f))
#    user  system elapsed 
#    0.34    0.00    0.34 

@ckatsulis
Copy link
Author

ckatsulis commented Dec 19, 2018 via email

joshuaulrich added a commit that referenced this issue Dec 20, 2018
The period.XXX() functions (sum, prod, min, max) only work on
univariate series, so the as.matrix() call is not necessary. It makes
the functions considerably slower, because it converts the xts index
into row names for the matrix.

See #278.
@AndreMikulec
Copy link

I can not repeat this test. What is 'd?'

system.time(for(i in seq_len(length(e))) f(d))

@joshuaulrich
Copy link
Owner

@AndreMikulec d is the first 4 rows of x. Sorry for the confusion.

d <- x[1:4]
system.time(for(i in seq_len(length(e))) f(d))
#    user  system elapsed 
#   0.543   0.000   0.544
system.time(period.apply.DT(x, e, f))
#    user  system elapsed 
#    0.31    0.00    0.31

@AndreMikulec
Copy link

I do not know.I am wild guessing.
Possibly multithreading is occuring.

From
https://github.com/Rdatatable/data.table/blob/d3ccd8139d3d592201e0f085c5f37a4fc5a426e3/NEWS.0.md

Contents

Following gsum and gmean, now gmin and gmax from GForce are also implemented. 
Closes part of #523. Benchmarks are also provided. 
R DT[, list(sum(x), min(y), max(z), .N), by=...] # runs by default using GForce

From
https://github.com/Rdatatable/data.table/blob/master/src/gsumm.c

Contents

. . . 
SEXP gforce(SEXP env, SEXP jsub, SEXP o, SEXP f, SEXP l, SEXP irowsArg)
. . . 
#pragma omp parallel for num_threads(getDTthreads())
. . . 
SEXP ans = PROTECT( eval(jsub, env) );

Benchmarks here are interesting.
From
Rdatatable/data.table#523

@joshuaulrich joshuaulrich added this to the 0.12-0 milestone Jul 4, 2019
@joshuaulrich joshuaulrich modified the milestones: 0.12-0, 0.12-1 Jul 12, 2020
@joshuaulrich joshuaulrich modified the milestones: 0.12.1, 0.12.2 Sep 9, 2020
@joshuaulrich joshuaulrich modified the milestones: 0.12.2, 0.12.3 Oct 12, 2022
@joshuaulrich joshuaulrich modified the milestones: 0.13.0, 0.13.1 Jan 16, 2023
@joshuaulrich joshuaulrich removed this from the 0.13.1 milestone Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement to existing feature help wanted
Projects
None yet
Development

No branches or pull requests

4 participants