Do note that the new plot.xts() includes breaking changes to the original (and rather limited) plot.xts(). However, we believe the new functionality more than compensates for the potential one-time inconvenience. And I will no longer have to tell people that I use plot.zoo() on xts objects!

This release also includes more bug fixes than you can shake a stick at. We squashed several bugs that could have crashed your R session. We also fixed some (always pesky and tricky) timezone issues. We've also done more sanity checking (e.g. for NA in the index), and provide more informative errors when things aren't right. And last, but not least, unit tests are running again!

I'm sure you were hoping to see some examples of the new plot.xts() functionality. Rather than clutter up this blog post with code, check out the basic examples, and the panel functionality examples that Ross Bennett created.

I'm looking forward to your questions and feedback! If you have a question, please ask on Stack Overflow and use the [r] and [xts] tags. Or you can send an email to the R-SIG-Finance mailing list (you must subscribe to post). Open an issue on GitHub if you find a bug or want to request a feature, but please read the contributing guide first!

You can explore the first chapter for free, so be sure to check it out!

A wealth of financial and economic data are available online. Learn how getSymbols() and Quandl() make it easy to access data from a variety of sources.

You've learned how to import data from online sources, now it's time to see how to extract columns from the imported data. After you've learned how to extract columns from a single object, you will explore how to import, transform, and extract data from multiple instruments.

Learn how to simplify and streamline your workflow by taking advantage of the ability to customize default arguments to getSymbols(). You will see how to customize defaults by data source, and then how to customize defaults by symbol. You will also learn how to handle problematic instrument symbols.

You've learned how to import, extract, and transform data from multiple data sources. You often have to manipulate data from different sources in order to combine them into a single data set. First, you will learn how to convert sparse, irregular data into a regular series. Then you will review how to aggregate dense data to a lower frequency. Finally, you will learn how to handle issues with intra-day data.

You've learned the core workflow of importing and manipulating financial data. Now you will see how to import data from text files of various formats. Then you will learn how to check data for weirdness and handle missing values. Finally, you will learn how to adjust stock prices for splits and dividends.

Unfortunately, the URL wasn't the only thing that changed. The actual data available for download changed as well.

The most noticeable difference is that the adjusted close column is no longer dividend-adjusted (i.e. it's only split-adjusted). Also, only the close price is unadjusted; the open, high, and low are split-adjusted.

There also appear to be issues with the adjusted prices in some instruments. For example, users reported issues with split data for XLF and SPXL in GitHub issue #160. For XLF, a split does not appear to be correctly reflected in the adjusted prices.

Another change is that the downloaded data may contain rows where all the values are "null". These appear on the website as "0". This is a major issue for some instruments. Take XLU for example; 188 of the 624 days of data are missing between 2014-12-04 and 2017-05-26 (ouch!). You can see this is even true on the Yahoo! Finance historical price page for XLU.
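As a sketch of how you might flag these bad rows after import (an assumption on my part, not code from the post: it presumes the standard OHLCV object that `getSymbols()` returns, with the "null" values arriving as zeros or `NA`):

```r
library(quantmod)

# Import XLU and look for rows where every value is zero or NA
xlu <- getSymbols("XLU", auto.assign = FALSE)
vals <- coredata(xlu)
bad <- rowSums(vals == 0 | is.na(vals)) == ncol(xlu)
sum(bad)             # count of all-zero/NA rows
xlu.clean <- xlu[!bad, ]  # drop them before doing any analysis
```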

If these changes have made you look for a new data provider, see my post: Yahoo! Finance Alternatives.

The most noticeable difference is that the adjusted close column is now only split-adjusted, whereas it used to be split- and dividend-adjusted. Another oddity is that only the close price is unadjusted (strangely, the open, high, and low are split-adjusted).

All these issues can be dealt with using tools that are currently available. For example, you can unadjust the open, high, and low prices using the ratio of close to adjusted close prices. And you can adjust for both splits and dividends using quantmod::adjustOHLC().
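As a sketch of both fixes (my illustration, assuming SPY imported from Yahoo via `getSymbols()`, where columns 1-3 are open, high, and low per quantmod's OHLCVA layout):

```r
library(quantmod)

SPY <- getSymbols("SPY", auto.assign = FALSE)

# Un-adjust the split-adjusted open/high/low using the
# ratio of close to adjusted close
ratio <- drop(coredata(Cl(SPY) / Ad(SPY)))
SPY.unadj <- SPY
SPY.unadj[, 1:3] <- coredata(SPY[, 1:3]) * ratio

# Or adjust the whole series for both splits and dividends;
# by default adjustOHLC() fetches split and dividend data itself
SPY.adj <- adjustOHLC(SPY)
```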

Unfortunately, there also appear to be issues with data quality. Some instruments have rows where all the prices and volume are zeros (e.g. XLU). The adjusted close in some instruments is incorrect because of missing split events, or double-counting splits and special dividends.

So, what are your alternatives? If you're just tinkering, you can try other free data sources like Google Finance or Quandl. Note that Google Finance data is already split-adjusted, so you might need to adjust for dividends, or un-adjust for splits, depending on your needs. Quandl has a wiki of end-of-day stock prices curated by the community. You only need a free account to access the data.
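Accessing that community-curated Quandl database might look like this (a sketch; `WIKI` is the database code for the end-of-day stock price wiki, and the API key is a placeholder you'd replace with your own):

```r
library(Quandl)

Quandl.api_key("your_api_key")             # free-account key (placeholder)
aapl <- Quandl("WIKI/AAPL", type = "xts")  # end-of-day prices for AAPL
```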

If you're using the data to make actual investment decisions, you should really be using a professional data provider. At the very least, you get someone to yell at when the data have errors. :) First, you should check if your broker provides the historical data you need (e.g. Interactive Brokers provides historical and real-time data to account-holders).

If your broker doesn't provide historical data, here are a few providers you may want to consider:

- Provide limited historical data for free
- For a one-time fee:
  - $20-$50 for 10 years of daily data
  - $40-$100 for 20 years of daily data
- Massive historical equity database
  - $600 annually for 30 years of daily data
  - Ability to adjust for splits and dividends
- Mainly a real-time data provider, but also has historical data
  - Features
  - Pricing, starts at $78/month

Leave a comment if you know of another end-of-day data provider that I didn't list!

All three providers made breaking changes to their URLs/interfaces.

getSymbols.google also got some love. It now honors all arguments set via setSymbolLookup (#138), and it correctly parses the date column in non-English locales (#140).

There's a handy new argument to getDividends: split.adjust. It allows you to request dividends unadjusted for splits (#128). Yahoo provides split-adjusted dividends, so you previously had to manually unadjust them for splits if you wanted the original raw values. To import the raw unadjusted dividends, just call:

rawDiv <- getDividends("IBM", split.adjust = FALSE)

Note that the default is split.adjust = TRUE to maintain backward-compatibility.

A quantmod user asked an interesting question on StackOverflow: Looping viewFinancials from quantmod. Basically, they wanted to create a `data.frame` that contained financial statement data for several companies for several years. I answered their question, and thought others might find the function I wrote useful… hence, this post!

I called the function `stackFinancials()` because it would use `getFinancials()` and `viewFinancials()` to pull financial statement data for multiple symbols, and stack them together in long form. I chose a long data format because I don’t know whether the output of `viewFinancials()` always has the same number of rows and columns for a given `type` and `period`. The long format makes it easy to put all the data in one object.

```r
stackFinancials <- function(symbols, type = c("BS", "IS", "CF"),
                            period = c("A", "Q")) {
  # Ensure the type and period arguments match viewFinancials
  type <- match.arg(toupper(type[1]), c("BS", "IS", "CF"))
  period <- match.arg(toupper(period[1]), c("A", "Q"))

  # Simple function to get financials for one symbol
  getOne <- function(symbol, type, period) {
    gf <- getFinancials(symbol, auto.assign = FALSE)
    vf <- viewFinancials(gf, type = type, period = period)
    # Put viewFinancials output into a data.frame
    df <- data.frame(vf, line.item = rownames(vf), type = type,
                     period = period, symbol = symbol,
                     stringsAsFactors = FALSE, check.names = FALSE)
    # Reshape data.frame into long format
    long <- reshape(df, direction = "long", varying = seq(ncol(vf)),
                    v.names = "value", idvar = "line.item",
                    times = colnames(vf))
    # Reset row.names to "automatic"
    rownames(long) <- NULL
    # Return data
    long
  }

  # Loop over all symbols
  allData <- lapply(symbols, getOne, type = type, period = period)
  # rbind() all into one data.frame
  do.call(rbind, allData)
}
```

Here’s a simple example of how to use `stackFinancials()` to pull the quarterly (`period = "Q"`) income statements (`type = "IS"`) for General Electric and Apple:

```r
library(quantmod)

Data <- stackFinancials(c("GE", "AAPL"), type = "IS", period = "Q")
head(Data, 4)
```

```
##                line.item type period symbol       time value
## 1                Revenue   IS      Q     GE 2016-12-31 33088
## 2   Other Revenue, Total   IS      Q     GE 2016-12-31    NA
## 3          Total Revenue   IS      Q     GE 2016-12-31 33088
## 4 Cost of Revenue, Total   IS      Q     GE 2016-12-31 24775
```

Now that we have the output in `Data`, let’s do something with it. You could simply subset `Data` to extract the components you want. For example, if you wanted to look at Apple’s quarterly revenue, you could subset `Data` where `symbol == "AAPL"` and `line.item == "Total Revenue"`. But if you’re going to be slicing-and-dicing a lot, it often helps to write a general function to simplify things. So I wrote `extractLineItem()`. It takes the output of `stackFinancials()` and a regular expression of the line item you want, and it returns an xts object that contains the given line items for all symbols in the data.

```r
extractLineItem <- function(stackedFinancials, line.item) {
  if (missing(stackedFinancials) || missing(line.item)) {
    stop("You must provide output from stackFinancials(), ",
         "and the line.item to extract")
  }
  # Select line items matching user input
  match.rows <- grepl(line.item, stackedFinancials$line.item, ignore.case = TRUE)
  sfSubset <- stackedFinancials[match.rows, ]

  getItem <- function(x) {
    # Create xts object
    output <- xts(x$value, as.yearmon(x$time))
    # Ensure column names are syntactically valid
    valid.names <- make.names(paste(x$symbol[1], x$line.item[1]))
    # Remove repeating periods
    colnames(output) <- gsub("\\.+", ".", valid.names)
    output
  }

  # Split subset by line.item and symbol
  symbol.item <- split(sfSubset, sfSubset[, c("symbol", "line.item")])
  # Apply getItem() to each chunk, and merge into one object
  do.call(merge, lapply(symbol.item, getItem))
}
```

Let’s use `extractLineItem()` to compare total revenue for GE and AAPL.

```r
totalRevenue <- extractLineItem(Data, "total revenue")
totalRevenue
```

```
##          AAPL.Total.Revenue GE.Total.Revenue
## Dec 2015              75872            24654
## Mar 2016              50557            27845
## Jun 2016              42358            61339
## Sep 2016              46852            90605
## Dec 2016              78351            33088
```

`plot(totalRevenue, main = "Quarterly Total Revenue, AAPL (black) vs GE (red)")`

You could also combine multiple calls to `extractLineItem()` to calculate ratios not included in the output from `viewFinancials()`. For example, you could divide operating income by total revenue to calculate operating margin.

```r
operatingIncome <- extractLineItem(Data, "operating income")
operatingIncome
```

```
##          AAPL.Operating.Income GE.Operating.Income
## Dec 2015                 24171                2863
## Mar 2016                 13987                 545
## Jun 2016                 10105                4736
## Sep 2016                 11761                6138
## Dec 2016                 23359                2892
```

`plot(operatingIncome / totalRevenue, main = "Quarterly Operating Margin, AAPL (black) vs GE (red)")`

R/Finance 2017: Applied Finance with R

May 19 and 20, 2017

University of Illinois at Chicago

The ninth annual R/Finance conference for applied finance using R will be held on May 19 and 20, 2017 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past eight years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2017.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.

All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

Financial assistance for travel and accommodation may be available to presenters, however requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.

Please submit proposals online at http://go.uic.edu/rfinsubmit. Submissions will be reviewed and accepted on a rolling basis with a final deadline of February 28, 2017. Submitters will be notified via email by March 31, 2017 of acceptance, presentation length, and financial assistance (if requested).

Additional details will be announced via the conference website www.RinFinance.com as they become available. Information on previous years' presenters and their presentations are also at the conference website. We will make a separate announcement when registration opens.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

Oanda changed their URL format from http to https, and getSymbols.oanda did not follow the redirect. Yahoo Finance changed the HTML for displaying options data, which broke getOptionChain.yahoo. The fix downloads JSON instead of scraping HTML, so hopefully it will be less likely to break. For more information, see the links to the GitHub issues above.

I added documentation for getPrice (#77), and removed the unused unsetSymbolLookup function and corresponding documentation (#115).


Subject: Data Mining Tutorial, R/Finance course series, and more!

I'm excited to announce that I'm working on a course for this new series! It will provide an introduction to importing and managing financial data.

R/Finance - A new course series in the works

We are working on a whole new course series on applied finance using R. This new series will cover topics such as time series (David S. Matteson), portfolio analysis (Kris Boudt), the xts and zoo packages (Jeffrey Ryan), and much more. Start our first course Intro to Credit Risk Modeling in R today.

If you've ever done anything with financial or economic time series, you know the data come in various shapes, sizes, and periodicities. Getting the data into R can be stressful and time-consuming, especially when you need to merge data from several different sources into one data set. This course will cover importing data from local files as well as from internet sources.

The tentative course outline is below. I'd really appreciate your feedback on what should be included in this introductory course! So let me know if I've omitted something, or if you think any of the topics are too advanced.

- Introduction and downloading data
- getSymbols design overview, Quandl
- Finding and downloading data from internet sources
- E.g. getSymbols.yahoo, getSymbols.FRED, Quandl
- Loading and transforming multiple instruments
- Checking for errors (i.e. summary stats, visualizing)
- Managing data from multiple sources
- Setting per-instrument sources and default arguments
- setSymbolLookup, saveSymbolLookup, loadSymbolLookup, setDefaults
- Handling instrument names that clash or are not valid R object names
- Aligning data with different periodicities
- Making irregular data regular
- Aggregating to lowest frequency
- Combining monthly with daily
- Combining daily with intraday
- Storing and updating data
- Creating an initial RData-backed storage
- Adjusting financial time-series
- Handling errors during update process
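A rough sketch of the "aligning data with different periodicities" items in the outline above, using hypothetical `daily.xts` and `monthly.xts` objects (the object names are illustrative, not from the course):

```r
library(xts)

# Make irregular data regular: merge onto an empty xts with a
# regular daily index, leaving NA on the days with no observation
reg.index <- seq(as.Date(start(daily.xts)), as.Date(end(daily.xts)), by = "day")
regular <- merge(daily.xts, xts(, reg.index))

# Combine monthly with daily, carrying the last monthly value
# forward until the next monthly observation
combined <- na.locf(merge(daily.xts, monthly.xts))
```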

Note that registration fees

The conference will take place on May 20 and 21, at UIC in Chicago. Building on the success of the previous conferences in 2009-2015, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 50 presenters covering all areas of finance with R.

We are very excited about the four keynote presentations given by Patrick Burns, Frank Diebold, Tarek Eldin, and Rishi Narang. The conference agenda (currently) includes 17 full presentations and 33 shorter "lightning talks". As in previous years, several (optional) pre-conference seminars are offered on Friday morning.

There is also an (optional) conference dinner at The Riverside Room and Gallery at Trump. Situated directly on the hotel's new River Walk, it is a perfect venue to continue conversations while dining and drinking.

We would like to thank our 2016 sponsors for their continued support, enabling us to host such an exciting conference:

UIC Liautaud Master of Science in Finance

Microsoft

MS-Computational Finance and Risk Management at University of Washington

Charles Schwab

Hull Investments

Interactive Brokers

OneMarketData

RStudio

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich


- The R backtest assumes fractional shares. This means that equity is fully invested at each new position. This is important because it affects drawdown calculations.
- When calculating the Sharpe ratio, the “geometric = FALSE” option must be used otherwise the result may not be correct. It took some time to figure that out.
- The profit factor result in R does not reconcile with results from other platforms or even from excel. PF in R is shown as 1.23 but the correct value is 1.17. Actually, the profit factor is calculated on a per share basis in R, although returns are geometric.

I completely agree with the first point. I'm not sure Mike considers the output of SharpeRatio.annualized with geometric=TRUE to be suspect (he doesn't elaborate). The overnightRets are calculated as arithmetic returns, so it's proper to aggregate them using geometric chaining (i.e. multiplication).

I also agree with the third point, because the R code used to calculate profit factor is wrong. My main impetus to write this post was to provide a corrected profit factor calculation. The calculation (with slightly modified syntax) in Mike's post is:

```r
require(quantmod)

getSymbols('SPY', from = '1900-01-01')
SPY <- adjustOHLC(SPY, use.Adjusted = TRUE)
overnightRets <- na.omit(Op(SPY) / lag(Cl(SPY)) - 1)
posRet <- overnightRets > 0
profitFactor <- -sum(overnightRets[posRet]) / sum(overnightRets[!posRet])
```

Note that profit factor in the code above is calculated by summing positive and negative *returns*, when it should be calculated using positive and negative *P&L*. In order to do that, we need to calculate the equity curve and then take its first difference to get P&L. The corrected calculation is below, and it provides the correct result Mike expected.

```r
# Equity curve, assuming full reinvestment (fractional shares)
grossEquity <- cumprod(1 + overnightRets)
grossPnL <- diff(grossEquity)
grossProfit <- sum(grossPnL[grossPnL > 0])
grossLoss <- sum(grossPnL[grossPnL < 0])
profitFactor <- grossProfit / abs(grossLoss)
```

I'd also like to respond to Mike's comment:

Since in the past I have identified serious flaws in commercially available backtesting platforms, I would not be surprised if some of the R libraries have some flaws.

I'm certain all of the backtesting R packages have flaws/bugs. All software has bugs because all software is written by fallible humans. One nice thing about (most) R packages is that they're open source, which means anyone/everyone can check the code for bugs, and fix any bugs that are found. With closed-source software, commercial or not, you depend on the vendor to deliver a patched version at their discretion and in their timing.

Now, I'm not making an argument that open source software is inherently better. I simply wanted to point out this one difference. As much as I love open source software, there are times where commercial vendor-supported software presents a more appealing set of tradeoffs than using open source software. Each situation is different.

James Toll provided a patch to the volatility function that uses a zero mean (instead of the sample mean) in close-to-close volatility. The other big change is that moving average functions no longer return objects with column names based on the input object column names. There are many other bug fixes (see the CHANGES file in the package).

The biggest changes in quantmod were to fix getSymbols.MySQL to use the correct dbConnect call based on changes made in RMySQL_0.10 and to fix getSymbols.FRED to use https:// instead of http:// when downloading FRED data. getSymbols.csv also got some much-needed love.

I'd also like to mention that development has moved to GitHub for both TTR and quantmod.

This new engine improves the functionality, modularity, and flexibility of

The main objective was to provide functionality similar to

- Basic time series plots with sensible defaults
- Plotting xts objects by column "automagically" as separate panels
- Small multiples with multiple pages
- "Layout-safe" so multiple specifications/panels can be charted in a single device
- Easily add data to an existing plot or add panels similar to `quantmod::add*`
- Event lines

The new version of

Note that the new

The conference will take place on May 29 and 30, at UIC in Chicago. Building on the success of the previous conferences in 2009-2014, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 30+ presenters covering all areas of finance with R.

We are very excited about the four keynote presentations given by Emanuel Derman, Louis Marascio, Alexander McNeil, and Rishi Narang. The main agenda (currently) includes 18 full presentations and 19 shorter "lightning talks". As in previous years, several (optional) pre-conference seminars are offered on Friday morning.

There is also an (optional) conference dinner that will once-again be held at The Terrace at Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.

We would like to thank our 2015 sponsors for their continued support, enabling us to host such an exciting conference:

International Center for Futures and Derivatives at UIC

Revolution Analytics

MS-Computational Finance at University of Washington

OneMarketData

Ketchum Trading

RStudio

SYMMYS

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich


Changes to the Yahoo Finance and Oanda websites broke the getOptionChain.yahoo and getSymbols.oanda functions, respectively. I didn’t use getOptionChain.yahoo much, so I’m not certain I restored all the prior functionality. Let me know if there’s something I missed. I’d be glad to add a test case for that, or to add a test you’ve written.

The getSymbols.yahooj function is a major enhancement provided by Wouter Thielen. It allows quantmod users to pull stock data from Yahoo Finance Japan.

Japanese ticker symbols usually start with a number and it is cumbersome to use variable names that start with a number in the R environment, so the string "YJ" will be prepended to each of the Symbols. I recommend using setSymbolLookup to prepend the ticker symbols with “YJ” yourself, so you can just use the main getSymbols function.

For example, if you want to pull Sony data, you would run:

```r
require(quantmod)

setSymbolLookup(YJ6758.T = 'yahooj')
getSymbols('YJ6758.T')
```

The full list of supported data sources for quantmod is now: Yahoo Finance-US, Yahoo Finance-Japan, Google Finance, csv, RData (including rds and rda), FRED, SQLite, MySQL, and Oanda.

Contributions to add support for additional data sources are welcomed. The existing getSymbols functions are good templates to start from.



If you’re interested in participating as a student or a mentor, there's an overview of the GSoC program on The R Project GSoC 2015 Wiki. The wiki also includes a timeline and links to prior years' projects.

Several mentors from various backgrounds have already proposed projects for students to work on this summer. Mentors have until March 9th to submit projects they would be willing to support, and student applications begin on March 16th.

There are also several bug fixes. A few worth noting are:


R/Finance 2015: Applied Finance with R

May 29 and 30, 2015

University of Illinois at Chicago

The seventh annual R/Finance conference for applied finance using R will be held on May 29 and 30, 2015 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past six years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2015. This year will include invited keynote presentations by Emanuel Derman, Louis Marascio, Alexander McNeil, and Rishi Narang.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.

All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters, however requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.

Please make your submission online at: http://www.cvent.com/d/t4qy73. The submission deadline is January 31, 2015. Submitters will be notified via email by February 28, 2015 of acceptance, presentation length, and financial assistance (if requested).

Additional details will be announced via the conference website as they become available. Information on previous years' presenters and their presentations are also at the conference website.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

The comments below are based on my personal experience. If I don't comment on a seminar or presentation, it doesn't mean I didn't like it or it wasn't good; it may have been over my head or I may have been distracted with my duties as a committee member. All the currently available conference slides are available on the website.

I went to Dirk Eddelbuettel's seminar because I may be writing a R package to query Deltix's TimeBase database. Deltix provides a C++ API, so this is a perfect opportunity to use Rcpp.

The first presentation was given by keynote Luke Tierney, who discussed recent and upcoming performance improvements to R, and introduced some new profiling tools in his proftools package (and a new proftools-GUI package).

Yang Lu explored the low-risk anomaly on high/low volatility portfolios with similar industry, size, and volume. Avery Moon discussed how they use R at Wealthfront to run cashflow simulations for their tax-loss harvesting strategy. Steven Pav used math and memes to discuss portfolio inference. Tobias Setz used the Bayesian Change Point method to analyze time series stability.

Paul Teetor and Matthew Clegg discussed different aspects of pairs trading. Kent Hoxsey demonstrated a simple way to explore trading signal expectation. Matthew Barry introduced the pbo package, which implements some of the ideas in the paper, The Probability of Backtest Overfitting.

Alexios Ghalanos was the day's second keynote, and he discussed smooth transition autoregressive models and his new package, twinkle. Alexios wrote a post discussing his presentation, which you should definitely read.

During the two-hour conference reception at UIC, I had some drinks and hors d'oeuvres, talked with speakers, and met people I encouraged to attend and/or present. Next was the (optional) dinner at The Terrace at Trump. It was cold and windy

The first presentation was a lightning talk by Chirag Anand, where he introduced the eventstudies package, which is very well done. Casey King gave an incredibly informative and entertaining presentation on anti-money laundering and suspicious activity reporting in penny stocks using message board posts. Bryan Lewis introduced his IRL package and ran a 16 million node network analysis in < 2 minutes on his Chromebook, during his talk. Stephen Rush discussed his work on VPIN (volume synchronized probability of informed trading), while competing with Steven Pav for the "presentation with the most memes".

Bob McDonald gave the third keynote presentation, where he discussed using R to teach derivatives in MBA classes. He also explained his decision to adopt R in terms of valuing an option. Eric Zivot discussed his upcoming book, "Modeling Financial Time Series with

Bill Cleveland gave the final keynote and talked about the "divide-and-recombine" method for large, complex data, using R and Hadoop. Gregor Kastner introduced his stochvol package, and Matthew Dixon showed how to calibrate stochastic volatility models using his "alpha" gpusvcalibration package. Dirk Eddelbuettel closed the conference with a lightning talk on his recently-released RcppRedis package.

The committee also presented the awards for best papers. The winners were:

- *Portfolio inference with this one weird trick*, Steven E. Pav
- *Dealing with Stochastic Volatility in Time Series Using the R Package stochvol*, Gregor Kastner
- *Re-Evaluation of the Low-Risk Anomaly in Finance via Matching*, Yang Lu, Daniel Wu, Kwok Yu
- *All words are not equal: Sentiment dynamics and information content within CEO letters*, Kris Boudt, James Thewissen

As always, the conference ended with one more trip to Jaks Tap. I spent some time giving college students some advice about starting their careers, and discussed the presentation I gave earlier in the week at the Chicago R User Group on Profiling for Speed.

Last, but not least: none of this would be possible without the support of fantastic sponsors:

International Center for Futures and Derivatives at UIC, Revolution Analytics, MS-Computational Finance at University of Washington, OneMarketData, RStudio, TIBCO, SYMMYS, and Paradigm4.

This post demonstrates the three main functions used to specify a portfolio optimization problem: `portfolio.spec`, `add.constraint`, and `add.objective`.

```
library(PortfolioAnalytics)

data(edhec)
returns <- edhec[, 1:6]
funds <- colnames(returns)
```

Here we create a portfolio object with `portfolio.spec`. The `assets` argument is a required argument to the `portfolio.spec` function. `assets` can be a character vector with the names of the assets, a named numeric vector, or a scalar value specifying the number of assets. If a character vector or scalar value is passed in for `assets`, equal weights will be created for the initial portfolio weights.

```
init.portfolio <- portfolio.spec(assets = funds)
```
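A quick base-R sketch of the three accepted forms of the `assets` argument (illustrative values only; the equal-weight arithmetic mirrors the behavior described above, not package internals):

```r
# Three ways to express the `assets` argument (hypothetical values):
by.name   <- c("Conv Arb", "CTA", "Distressed")   # character vector of names
by.weight <- c("Conv Arb" = 0.50, "CTA" = 0.25,
               "Distressed" = 0.25)               # named numeric: initial weights
by.count  <- 3                                    # scalar: number of assets

# For the character-vector and scalar forms, equal initial weights result:
equal.weights <- rep(1 / length(by.name), length(by.name))
```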

The `portfolio` object is an S3 class that contains portfolio-level data as well as the constraints and objectives for the optimization problem. You can see that the constraints and objectives lists are currently empty, but we will add sets of constraints and objectives with `add.constraint` and `add.objective`.

```
print.default(init.portfolio)
```

```
## $assets
## Convertible Arbitrage            CTA Global Distressed Securities
##                0.1667                0.1667                0.1667
##      Emerging Markets Equity Market Neutral          Event Driven
##                0.1667                0.1667                0.1667
##
## $category_labels
## NULL
##
## $weight_seq
## NULL
##
## $constraints
## list()
##
## $objectives
## list()
##
## $call
## portfolio.spec(assets = funds)
##
## attr(,"class")
## [1] "portfolio.spec" "portfolio"
```

Here we add the full investment constraint. The full investment constraint is a special case of the leverage constraint that specifies the weights must sum to 1, and is specified with the alias `type="full_investment"` as shown below.

```
init.portfolio <- add.constraint(portfolio = init.portfolio, type = "full_investment")
```

Now we add a box constraint to specify a long-only portfolio. The long-only constraint is a special case of a box constraint where the lower bound of each asset's weight is 0 and the upper bound is 1. This is specified with `type="long_only"` as shown below. The box constraint also allows per-asset weight bounds to be specified.

```
init.portfolio <- add.constraint(portfolio = init.portfolio, type = "long_only")
```
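To make these two constraints concrete, here is a small base-R sketch of the conditions they impose on a weight vector (just the math behind the aliases, not PortfolioAnalytics internals; `satisfies_constraints` is a hypothetical helper):

```r
# full_investment: a leverage constraint with min_sum = max_sum = 1
# long_only: a box constraint with 0 <= w_i <= 1 for every asset
satisfies_constraints <- function(w, tol = 1e-8) {
  full.investment <- abs(sum(w) - 1) < tol
  long.only <- all(w >= 0 & w <= 1)
  full.investment && long.only
}

satisfies_constraints(rep(1/6, 6))   # equal weights: TRUE
satisfies_constraints(c(1.2, -0.2))  # sums to 1, but holds a short: FALSE
```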

The following constraint types are supported:

- leverage
- box
- group
- position_limit^1
- turnover^2
- diversification
- return
- factor_exposure
- transaction_cost^2

1. Not supported for problems formulated as quadratic programming problems solved with `optimize_method="ROI"`.
2. Not supported for problems formulated as linear programming problems solved with `optimize_method="ROI"`.

The code below starts with `init.portfolio` and adds the objectives specified below to `minSD.portfolio` and `meanES.portfolio`, while leaving `init.portfolio` unchanged. This is useful for testing multiple portfolios with different objectives using the same constraints, because the constraints only need to be specified once and several new portfolios can be created from one initial portfolio object.

```
# Add objective for portfolio to minimize portfolio standard deviation
minSD.portfolio <- add.objective(portfolio=init.portfolio,
                                 type="risk",
                                 name="StdDev")

# Add objectives for portfolio to maximize mean return per unit ES
meanES.portfolio <- add.objective(portfolio=init.portfolio,
                                  type="return",
                                  name="mean")

meanES.portfolio <- add.objective(portfolio=meanES.portfolio,
                                  type="risk",
                                  name="ES")
```

Note that the `name` argument in `add.objective` can be any valid R function. Several functions provided in the PerformanceAnalytics package, such as ES/ETL/CVaR and StdDev, can be specified as the `name` argument. The following objective types are supported:

- return
- risk
- risk_budget
- weight_concentration

The `add.constraint` and `add.objective` functions were designed to be very flexible and modular, so that constraints and objectives can easily be specified and added to `portfolio` objects. PortfolioAnalytics provides a `print` method so that we can easily view the assets, constraints, and objectives that we have specified for the portfolio.

```
print(minSD.portfolio)
```

```
## **************************************************
## PortfolioAnalytics Portfolio Specification
## **************************************************
##
## Call:
## portfolio.spec(assets = funds)
##
## Assets
## Number of assets: 6
##
## Asset Names
## [1] "Convertible Arbitrage" "CTA Global"            "Distressed Securities"
## [4] "Emerging Markets"      "Equity Market Neutral" "Event Driven"
##
## Constraints
## Number of constraints: 2
## Number of enabled constraints: 2
## Enabled constraint types
##   - full_investment
##   - long_only
## Number of disabled constraints: 0
##
## Objectives
## Number of objectives: 1
## Number of enabled objectives: 1
## Enabled objective names
##   - StdDev
## Number of disabled objectives: 0
```

```
print(meanES.portfolio)
```

```
## **************************************************
## PortfolioAnalytics Portfolio Specification
## **************************************************
##
## Call:
## portfolio.spec(assets = funds)
##
## Assets
## Number of assets: 6
##
## Asset Names
## [1] "Convertible Arbitrage" "CTA Global"            "Distressed Securities"
## [4] "Emerging Markets"      "Equity Market Neutral" "Event Driven"
##
## Constraints
## Number of constraints: 2
## Number of enabled constraints: 2
## Enabled constraint types
##   - full_investment
##   - long_only
## Number of disabled constraints: 0
##
## Objectives
## Number of objectives: 2
## Number of enabled objectives: 2
## Enabled objective names
##   - mean
##   - ES
## Number of disabled objectives: 0
```

Now that we have portfolios set up with the desired constraints and objectives, we use `optimize.portfolio` to run the optimizations. The examples below use `optimize_method="ROI"`, but several other solvers are supported, including the following:

- DEoptim (differential evolution)
- random portfolios
  - sample
  - simplex
  - grid
- GenSA (generalized simulated annealing)
- pso (particle swarm optimization)
- ROI (R Optimization Infrastructure)
  - Rglpk
  - quadprog

The objective to minimize portfolio standard deviation can be formulated as a quadratic programming problem and solved quickly with `optimize_method="ROI"`.

```
# Run the optimization for the minimum standard deviation portfolio
minSD.opt <- optimize.portfolio(R = returns, portfolio = minSD.portfolio,
                                optimize_method = "ROI", trace = TRUE)

print(minSD.opt)
```

```
## ***********************************
## PortfolioAnalytics Optimization
## ***********************************
##
## Call:
## optimize.portfolio(R = returns, portfolio = minSD.portfolio,
##     optimize_method = "ROI", trace = TRUE)
##
## Optimal Weights:
## Convertible Arbitrage            CTA Global Distressed Securities
##                0.0000                0.0652                0.0000
##      Emerging Markets Equity Market Neutral          Event Driven
##                0.0000                0.9348                0.0000
##
## Objective Measure:
## StdDev
## 0.008855
```

The objective to maximize mean return per unit ES can be formulated as a linear programming problem and solved quickly with `optimize_method="ROI"`.

```
# Run the optimization to maximize mean return per unit ES
meanES.opt <- optimize.portfolio(R = returns, portfolio = meanES.portfolio,
                                 optimize_method = "ROI", trace = TRUE)

print(meanES.opt)
```

```
## ***********************************
## PortfolioAnalytics Optimization
## ***********************************
##
## Call:
## optimize.portfolio(R = returns, portfolio = meanES.portfolio,
##     optimize_method = "ROI", trace = TRUE)
##
## Optimal Weights:
## Convertible Arbitrage            CTA Global Distressed Securities
##                0.0000                0.2940                0.2509
##      Emerging Markets Equity Market Neutral          Event Driven
##                0.0000                0.4552                0.0000
##
## Objective Measure:
## mean
## 0.006635
##
## ES
## 0.01837
```

The PortfolioAnalytics package provides charting functions to better understand the optimization problem through visualization. The `plot` function produces a plot of the optimal weights and the optimal portfolio in risk-return space. The optimal weights and the chart in risk-return space can be plotted separately with `chart.Weights` and `chart.RiskReward`.

```
plot(minSD.opt, risk.col="StdDev", chart.assets=TRUE,
     main="Min SD Optimization",
     ylim=c(0, 0.0083), xlim=c(0, 0.06))

plot(meanES.opt, chart.assets=TRUE,
     main="Mean ES Optimization",
     ylim=c(0, 0.0083), xlim=c(0, 0.16))
```

This post demonstrates how to construct a portfolio object, add constraints, and add objectives for two simple optimization problems: one to minimize portfolio standard deviation and another to maximize mean return per unit expected shortfall. We then run optimizations on both portfolio objects and plot the results of each portfolio optimization. Although this post demonstrates fairly simple constraints and objectives, PortfolioAnalytics supports complex constraints and objectives, as well as many other features that will be covered in subsequent posts. The PortfolioAnalytics package is part of the ReturnAnalytics project on R-Forge. For additional examples and information, refer to the several vignettes and demos provided in the package.
Building on the success of the previous conferences in 2009-2013, we expect more than 250 attendees from around the world. R users from industry, academia, and government will join 30+ presenters covering all areas of finance with R.

We are very excited about the four keynote presentations given by Bob McDonald, Bill Cleveland, Alexios Ghalanos, and Luke Tierney. The main agenda (currently) includes 16 full presentations and 21 shorter "lightning talks". We are also excited to offer four optional pre-conference seminars on Friday morning.

The (optional) conference dinner will once-again be held at The Terrace at Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.

More details of the agenda are available at:

http://www.RinFinance.com/agenda/

Registration information is available at:

http://www.RinFinance.com/register/

and can also be directly accessed by going to:

http://www.regonline.com/RFinance2014

We would like to thank our 2014 Sponsors for the continued support enabling us to host such an exciting conference:

International Center for Futures and Derivatives at UIC

Revolution Analytics

MS-Computational Finance at University of Washington

OneMarketData

RStudio

TIBCO

SYMMS

paradigm4

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich


If your strategy is not path-dependent, you can get a fairly substantial performance improvement by turning path-dependence off. If your strategy truly is path-dependent, keep reading...

I started working with Brian Peterson in late August of this year, and we've been working on a series of very large backtests over the last several weeks. Each backtest consisted of ~7 months of 5-second data on 72 instruments with 15-20 different configurations for each.

These backtests really pushed quantstrat to its limits. The longest-running job took 32 hours. I had some time while they were running, so I decided to profile quantstrat. I was able to make some substantial improvements, so I thought I'd write a post to tell you what's changed and highlight some of the performance gains we're seeing.

The biggest issue was how the internal function that evaluates rules constructed the call to the rule function: it included the entire market data object in the call as a literal vector, like this:

```
ruleFunction(c(50.04, 50.23, 50.42, 50.37, 50.24, 50.13, 50.12, 50.42, 50.42,
               50.37, 50.24, 50.22, 49.95, 50.23, 50.26, 50.22, 50.11, 49.99,
               50.12, 50.4, 50.33, 50.33, 50.18, 49.99), ...)
```

instead of this:

```
ruleFunction(mktdata, ...)
```

You can only imagine how large that first call would be for a 10-million-row `mktdata` object.

If you think I would be happy enough with that, you don't know me. Several other changes helped get that 2-hour run down to under 30 minutes.

- We now calculate **periodicity(mktdata)** in **applyRules** and pass that value to **ruleOrderProc**. This avoids re-calculating the value for every order, since **mktdata** doesn't change inside **applyRules**.

- We also pass the current row index value to **ruleOrderProc** and **ruleSignal**, because xts subsetting via an integer is much faster than via a POSIXct object.

- **applyStrategy** only accumulates values returned from **applyIndicators**, **applySignals**, and **applyRules** if **debug=TRUE**. This saves a little time, but can save a lot of memory for large **mktdata** objects.

- The dimension reduction algorithm has to look for the first time the price crosses the limit order price. We were doing that with a call to **which(sigThreshold(...))[1]**. The relational operators (**<**, **<=**, **>**, **>=**, and **==**) and **which** operate on the entire vector, but we only need the first value, so I replaced that code with a C-based **.firstThreshold** function that stops as soon as it finds the first cross.
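The early-exit idea behind **.firstThreshold** can be sketched in plain R (a simplified illustration, not the actual C implementation; `first_cross` is a hypothetical name):

```r
# Find the first index where price crosses above a limit price.
# which(price > limit)[1] scans the whole vector first; Position()
# stops at the first hit, mirroring the early-exit C function.
first_cross <- function(price, limit) {
  Position(function(p) p > limit, price)
}

price <- c(50.1, 50.2, 50.4, 50.9, 51.2, 50.8)
first_cross(price, 50.5)  # same answer as which(price > 50.5)[1], i.e. 4
```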

All these changes are most significant for large data sets. The small demo strategies included with quantstrat are also faster, but the net performance gains increase as the size of the data, the number of signals (and therefore the number of rule evaluations), and the number of instruments increase.

You're still reading? What are you waiting for? Go install the latest from R-Forge and try it for yourself!


R/Finance 2014: Applied Finance with R

May 16 and 17, 2014

University of Illinois at Chicago

The sixth annual R/Finance conference for applied finance using R will be held on May 16 and 17, 2014 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past five years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2014.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format), although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks". Both academic and practitioner proposals related to R are encouraged.

Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters at the discretion of the conference committee. Requests for assistance should be made at the time of submission.

Please submit your papers or abstracts online at: goo.gl/OmKnu7. The submission deadline is January 31, 2014. Submitters will be notified via email by February 28, 2014 of acceptance, presentation length, and decisions on requested funding.

Additional details will be announced via the conference website www.RinFinance.com as they become available. Information on previous years' presenters and their presentations are also at the conference website.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,

Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

The comments below are based on my personal experience. If I don't comment on a seminar or presentation, it doesn't mean I didn't like it or it wasn't good; it may have been over my head or I may have been distracted with my duties as a committee member. All the currently available conference slides are available on the website.

I went to (and live-tweeted) Jeff Ryan's seminar because I wanted to learn more about how he uses mmap+indexing with options data. There I realized that POSIXlt components use a zero-based index because they mirror the underlying tm struct, and that mmap+indexing files can be shared across cores and you can read them from other languages (e.g. Python).

The first presentation was by keynote Ryan Sheftel, who talked about how he uses R on his bond trading desk. David Ardia showed how expected returns can be estimated via the covariance matrix. Ronald Hochreiter gave an overview of modeling optimization via his modopt package. Tammer Kamel gave a live demo of the Quandl package and said, "Quandl hopes to do to Bloomberg what Wikipedia did to Britannica."

I had the pleasure of introducing both Doug Martin, who talked about robust covariance estimation, and Giles Heywood, who discussed several ways of estimating and forecasting covariance, and proposed an "open source equity risk and backtest system" as a means of matching talent with capital.

Ruey Tsay was the next keynote, and spoke about using principal volatility components to simplify multivariate volatility modeling. Alexios Ghalanos spoke about modeling multivariate time-varying skewness and kurtosis. Unfortunately, I missed both Kris Boudt's and David Matteson's presentations, but I did get to see Winston Chang's live demo of Shiny.

The two-hour conference reception at UIC was a great time to have a drink, talk with speakers, and say hello to people I had never met in person. Next was the (optional) dinner at The Terrace at Trump. Unfortunately, it was cold and windy, so we only spent 15-20 minutes on the terrace before moving inside. The food was fantastic, but the conversations were even better.

I missed the first block of lightning talks. Samantha Azzarello discussed her work with Blu Putnam, which used a dynamic linear model to evaluate the Fed's performance vis-a-vis the Taylor Rule. Jiahan Li used constrained least squares on 4 economic fundamentals to forecast foreign exchange rates. Thomas Harte talked about regulatory requirements of foreign exchange pricing (and wins the award for most slides, 270); basically documentation is important, Sweave to the rescue!

Sanjiv Das gave a keynote on 4 applications: 1) network analysis on SEC and FDIC filings to determine banks that pose systemic risk, 2) determining which home mortgage modification is optimal, 3) portfolio optimization with mental accounting, 4) venture capital communities.

I had the pleasure of introducing the following speakers: Dirk Eddelbuettel showed how it's easy to write fast linear algebra code with RcppArmadillo. Klaus Spanderen showed how to use QuantLib from R, and even how to call C++ from R from C++. Bryan Lewis talked about SciDB and the scidb package (SciDB contains fast linear algebra routines that operate **on** the database!). Matthew Dowle gave an introduction to data.table (in addition to a seminar).

Attilio Meucci gave his keynote on visualizing advanced risk management and portfolio optimization. Immediately following, Brian Peterson gave a lightning talk on implementing Meucci's work in R (Attilio works in Matlab), which was part of a Google Summer of Code project last year.

Thomas Hanson presented his work with Don Chance (and others) on computational issues in estimating the volatility smile. Jeffrey Ryan showed how to manipulate options data in R with the greeks package.

The conference wrapped up by giving away three books, generously donated by Springer, to three random people who submitted feedback surveys. I performed the random drawing live on stage, using my patent-pending TUMC method (I tossed the papers up in the air).

The committee also presented the awards for best papers. The winners were:

- *Regime switches in volatility and correlation of financial institutions*, Boudt et al.
- *A Bayesian interpretation of the Federal Reserve's dual mandate and the Taylor Rule*, Putnam & Azzarello
- *Nonparametric Estimation of Stationarity and Change Points in Finance*, Matteson et al.
- *Estimating High Dimensional Covariance Matrix Using a Factor Model*, Sun (best student paper)

Last, but not least: none of this would be possible without the support of fantastic sponsors: International Center for Futures and Derivatives at UIC, Revolution Analytics, MS-Computational Finance at University of Washington, Google, lemnica, OpenGamma, OneMarketData, and RStudio.

Building on the success of the previous conferences in 2009, 2010, 2011 and 2012, we expect more than 250 attendees from around the world. R users from industry, academia, and government will join 30+ presenters covering all areas of finance with R.

We are very excited about the four keynotes by Sanjiv Das, Attilio Meucci, Ryan Sheftel, and Ruey Tsay. The main agenda (currently) includes seventeen full presentations and fifteen shorter "lightning talks". We are also excited to offer five optional pre-conference seminars on Friday morning.

To celebrate the fifth year of the conference in style, the dinner will be held at The Terrace at Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.

More details of the agenda are available at:

http://www.RinFinance.com/agenda/

Registration information is available at:

http://www.RinFinance.com/register/

and can also be directly accessed by going to:

http://www.regonline.com/RFinance2013

We would like to thank our 2013 Sponsors for the continued support enabling us to host such an exciting conference:

International Center for Futures and Derivatives at UIC

Revolution Analytics

MS-Computational Finance at University of Washington

lemnica

OpenGamma

OneMarketData

RStudio

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich
