All three providers made breaking changes to their URLs/interfaces.

getSymbols.google also got some love. It now honors all arguments set via setSymbolLookup (#138), and it correctly parses the date column in non-English locales (#140).

There's a handy new argument to getDividends: split.adjust. It allows you to request dividends unadjusted for splits (#128). Yahoo provides split-adjusted dividends, so you previously had to manually unadjust them for splits if you wanted the original raw values. To import the raw unadjusted dividends, just call:

```
rawDiv <- getDividends("IBM", split.adjust = FALSE)
```

Note that the default is `split.adjust = TRUE` to maintain backward compatibility.

A quantmod user asked an interesting question on StackOverflow: Looping viewFinancials from quantmod. Basically, they wanted to create a `data.frame` that contained financial statement data for several companies for several years. I answered their question, and thought others might find the function I wrote useful… hence, this post!

I called the function `stackFinancials()` because it would use `getFinancials()` and `viewFinancials()` to pull financial statement data for multiple symbols, and stack them together in long form. I chose a long data format because I don’t know whether the output of `viewFinancials()` always has the same number of rows and columns for a given `type` and `period`. The long format makes it easy to put all the data in one object.

```
stackFinancials <- function(symbols, type = c("BS", "IS", "CF"), period = c("A", "Q")) {
  # Ensure the type and period arguments match viewFinancials
  type <- match.arg(toupper(type[1]), c("BS", "IS", "CF"))
  period <- match.arg(toupper(period[1]), c("A", "Q"))
  # Simple function to get financials for one symbol
  getOne <- function(symbol, type, period) {
    gf <- getFinancials(symbol, auto.assign = FALSE)
    vf <- viewFinancials(gf, type = type, period = period)
    # Put viewFinancials output into a data.frame
    df <- data.frame(vf, line.item = rownames(vf), type = type,
                     period = period, symbol = symbol,
                     stringsAsFactors = FALSE, check.names = FALSE)
    # Reshape data.frame into long format
    long <- reshape(df, direction = "long", varying = seq(ncol(vf)),
                    v.names = "value", idvar = "line.item",
                    times = colnames(vf))
    # Reset row.names to "automatic"
    rownames(long) <- NULL
    # Return data
    long
  }
  # Loop over all symbols
  allData <- lapply(symbols, getOne, type = type, period = period)
  # rbind() all into one data.frame
  do.call(rbind, allData)
}
```

Here’s a simple example of how to use `stackFinancials()` to pull the quarterly (`period = "Q"`) income statements (`type = "IS"`) for General Electric and Apple:

```
library(quantmod)
Data <- stackFinancials(c("GE", "AAPL"), type = "IS", period = "Q")
head(Data, 4)
##                line.item type period symbol       time value
## 1                Revenue   IS      Q     GE 2016-12-31 33088
## 2   Other Revenue, Total   IS      Q     GE 2016-12-31    NA
## 3          Total Revenue   IS      Q     GE 2016-12-31 33088
## 4 Cost of Revenue, Total   IS      Q     GE 2016-12-31 24775
```

Now that we have the output in `Data`, let’s do something with it. You could simply subset `Data` to extract the components you want. For example, if you wanted to look at Apple’s quarterly revenue, you could subset `Data` where `symbol == "AAPL"` and `line.item == "Total Revenue"`. But if you’re going to be slicing and dicing a lot, it can often help to write a general function to simplify things. So I wrote `extractLineItem()`. It takes the output of `stackFinancials()` and a regular expression of the line item you want, and it returns an xts object that contains the given line items for all symbols in the data.
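Since the pattern is matched with `grepl(…, ignore.case = TRUE)`, partial and case-insensitive matches work. A quick base-R illustration (toy line items, not actual output):

```
# Case-insensitive pattern matching, as extractLineItem() does
items <- c("Total Revenue", "Other Revenue, Total", "Operating Income")
matches <- grepl("total revenue", items, ignore.case = TRUE)
# Only "Total Revenue" contains the contiguous pattern
```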

```
extractLineItem <- function(stackedFinancials, line.item) {
  if (missing(stackedFinancials) || missing(line.item)) {
    stop("You must provide output from stackFinancials(), ",
         "and the line.item to extract")
  }
  # Select line items matching user input
  match.rows <- grepl(line.item, stackedFinancials$line.item, ignore.case = TRUE)
  sfSubset <- stackedFinancials[match.rows, ]
  getItem <- function(x) {
    # Create xts object
    output <- xts(x$value, as.yearmon(x$time))
    # Ensure column names are syntactically valid
    valid.names <- make.names(paste(x$symbol[1], x$line.item[1]))
    # Remove repeating periods
    colnames(output) <- gsub("\\.+", ".", valid.names)
    output
  }
  # Split subset by line.item and symbol
  symbol.item <- split(sfSubset, sfSubset[, c("symbol", "line.item")])
  # Apply getItem() to each chunk, and merge into one object
  do.call(merge, lapply(symbol.item, getItem))
}
```

Let’s use `extractLineItem()` to compare total revenue for GE and AAPL.

```
totalRevenue <- extractLineItem(Data, "total revenue")
totalRevenue
##          AAPL.Total.Revenue GE.Total.Revenue
## Dec 2015              75872            24654
## Mar 2016              50557            27845
## Jun 2016              42358            61339
## Sep 2016              46852            90605
## Dec 2016              78351            33088
```

```
plot(totalRevenue, main = "Quarterly Total Revenue, AAPL (black) vs GE (red)")
```

You could also combine multiple calls to `extractLineItem()` to calculate ratios not included in the output from `viewFinancials()`. For example, you could divide operating income by total revenue to calculate operating margin.

```
operatingIncome <- extractLineItem(Data, "operating income")
operatingIncome
##          AAPL.Operating.Income GE.Operating.Income
## Dec 2015                 24171                2863
## Mar 2016                 13987                 545
## Jun 2016                 10105                4736
## Sep 2016                 11761                6138
## Dec 2016                 23359                2892
```

```
plot(operatingIncome / totalRevenue, main = "Quarterly Operating Margin, AAPL (black) vs GE (red)")
```

R/Finance 2017: Applied Finance with R

May 19 and 20, 2017

University of Illinois at Chicago

The ninth annual R/Finance conference for applied finance using R will be held on May 19 and 20, 2017 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past eight years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2017.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.

All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

Financial assistance for travel and accommodation may be available to presenters, however requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.

Please submit proposals online at http://go.uic.edu/rfinsubmit. Submissions will be reviewed and accepted on a rolling basis with a final deadline of February 28, 2017. Submitters will be notified via email by March 31, 2017 of acceptance, presentation length, and financial assistance (if requested).

Additional details will be announced via the conference website www.RinFinance.com as they become available. Information on previous years' presenters and their presentations are also at the conference website. We will make a separate announcement when registration opens.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

Oanda changed their URL format from http to https, and getSymbols.oanda did not follow the redirect. Yahoo Finance changed the HTML for displaying options data, which broke getOptionChain.yahoo. The fix downloads JSON instead of scraping HTML, so hopefully it will be less likely to break. For more information, see the links to the GitHub issues above.

I added documentation for getPrice (#77), and removed the unused unsetSymbolLookup function and corresponding documentation (#115).


Subject: Data Mining Tutorial, R/Finance course series, and more!

I'm excited to announce that I'm working on a course for this new series! It will provide an introduction to importing and managing financial data.

R/Finance - A new course series in the works

We are working on a whole new course series on applied finance using R. This new series will cover topics such as time series (David S. Matteson), portfolio analysis (Kris Boudt), the xts and zoo packages (Jeffrey Ryan), and much more. Start our first course Intro to Credit Risk Modeling in R today.

If you've ever done anything with financial or economic time series, you know the data come in various shapes, sizes, and periodicities. Getting the data into R can be stressful and time-consuming, especially when you need to merge data from several different sources into one data set. This course will cover importing data from local files as well as from internet sources.

The tentative course outline is below. I'd really appreciate your feedback on what should be included in this introductory course! So let me know if I've omitted something, or if you think any of the topics are too advanced.

- Introduction and downloading data
- getSymbols design overview, Quandl
- Finding and downloading data from internet sources
- E.g. getSymbols.yahoo, getSymbols.FRED, Quandl
- Loading and transforming multiple instruments
- Checking for errors (i.e. summary stats, visualizing)
- Managing data from multiple sources
- Setting per-instrument sources and default arguments
- setSymbolLookup, saveSymbolLookup, loadSymbolLookup, setDefaults
- Handling instrument names that clash or are not valid R object names
- Aligning data with different periodicities
- Making irregular data regular
- Aggregating to lowest frequency
- Combining monthly with daily
- Combining daily with intraday
- Storing and updating data
- Creating an initial RData-backed storage
- Adjusting financial time-series
- Handling errors during update process
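To give a flavor of the periodicity topics above, here's a minimal sketch (toy data, assuming the xts package) of combining a monthly series with a daily series by carrying the monthly value forward:

```
library(xts)
# Toy daily and monthly series (hypothetical values)
daily <- xts(1:10, seq(as.Date("2016-01-04"), by = "day", length.out = 10))
monthly <- xts(100, as.Date("2016-01-04"))
# merge() aligns both series on the union of their indexes;
# na.locf() carries the last monthly observation forward to each daily date
combined <- na.locf(merge(daily, monthly))
```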

Note that registration fees

The conference will take place on May 20 and 21, at UIC in Chicago. Building on the success of the previous conferences in 2009-2015, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 50 presenters covering all areas of finance with R.

We are very excited about the four keynote presentations given by Patrick Burns, Frank Diebold, Tarek Eldin, and Rishi Narang. The conference agenda (currently) includes 17 full presentations and 33 shorter "lightning talks". As in previous years, several (optional) pre-conference seminars are offered on Friday morning.

There is also an (optional) conference dinner at The Riverside Room and Gallery at Trump. Situated directly on the hotel's new River Walk, it is a perfect venue to continue conversations while dining and drinking.

We would like to thank our 2016 Sponsors for the continued support enabling us to host such an exciting conference:

UIC Liautaud Master of Science in Finance

Microsoft

MS-Computational Finance and Risk Management at University of Washington

Charles Schwab

Hull Investments

Interactive Brokers

OneMarketData

RStudio

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich


I completely agree with the first point. I'm not sure Mike considers the output of SharpeRatio.annualized with geometric=TRUE to be suspect (he doesn't elaborate). The overnightRets are calculated as arithmetic returns, so it's proper to aggregate them using geometric chaining (i.e. multiplication).
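A tiny base-R illustration of the difference (toy returns, not from the post): geometric chaining compounds the arithmetic returns via multiplication, while a simple sum ignores compounding:

```
# Two single-period arithmetic returns: +10% then -5%
rets <- c(0.10, -0.05)
geomTotal  <- prod(1 + rets) - 1  # compounded total return
arithTotal <- sum(rets)           # simple sum, slightly larger here
```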

- The R backtest assumes fractional shares. This means that equity is fully invested at each new position. This is important because it affects drawdown calculations.
- When calculating the Sharpe ratio, the “geometric = FALSE” option must be used otherwise the result may not be correct. It took some time to figure that out.
- The profit factor result in R does not reconcile with results from other platforms or even from excel. PF in R is shown as 1.23 but the correct value is 1.17. Actually, the profit factor is calculated on a per share basis in R, although returns are geometric.

I also agree with the third point, because the R code used to calculate profit factor is wrong. My main impetus to write this post was to provide a corrected profit factor calculation. The calculation (with slightly modified syntax) in Mike's post is:

```
require(quantmod)
getSymbols('SPY', from = '1900-01-01')
SPY <- adjustOHLC(SPY, use.Adjusted = TRUE)
overnightRets <- na.omit(Op(SPY)/lag(Cl(SPY)) - 1)
posRet <- overnightRets > 0
profitFactor <- -sum(overnightRets[posRet])/sum(overnightRets[!posRet])
```

Note that profit factor in the code above is calculated by summing positive and negative *returns*, when it should be calculated using positive and negative *P&L*. In order to do that, we need to calculate the equity curve and then take its first difference to get P&L. The corrected calculation is below, and it provides the correct result Mike expected.

```
# grossEquity is the gross equity curve (its construction is omitted here)
grossPnL <- diff(grossEquity)
grossProfit <- sum(grossPnL[grossPnL > 0])
grossLoss <- sum(grossPnL[grossPnL < 0])
profitFactor <- grossProfit / abs(grossLoss)
```
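To make the corrected calculation concrete, here's a self-contained toy example (hypothetical equity values, not SPY data):

```
# Hypothetical gross equity curve
grossEquity <- c(100, 103, 101, 106, 104)
grossPnL <- diff(grossEquity)                 # per-period P&L: 3, -2, 5, -2
grossProfit <- sum(grossPnL[grossPnL > 0])    # sum of winning periods
grossLoss <- sum(grossPnL[grossPnL < 0])      # sum of losing periods
profitFactor <- grossProfit / abs(grossLoss)
```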

I'd also like to respond to Mike's comment:

Since in the past I have identified serious flaws in commercially available backtesting platforms, I would not be surprised if some of the R libraries have some flaws.

I'm certain all of the backtesting R packages have flaws/bugs. All software has bugs because all software is written by fallible humans. One nice thing about (most) R packages is that they're open source, which means anyone/everyone can check the code for bugs, and fix any bugs that are found. With closed-source software, commercial or not, you depend on the vendor to deliver a patched version at their discretion and in their timing.

Now, I'm not making an argument that open source software is inherently better. I simply wanted to point out this one difference. As much as I love open source software, there are times where commercial vendor-supported software presents a more appealing set of tradeoffs than using open source software. Each situation is different.

James Toll provided a patch to the volatility function that uses a zero mean (instead of the sample mean) in close-to-close volatility. The other big change is that moving average functions no longer return objects with column names based on the input object column names. There are many other bug fixes (see the CHANGES file in the package).

The biggest changes in quantmod were to fix getSymbols.MySQL to use the correct dbConnect call based on changes made in RMySQL_0.10 and to fix getSymbols.FRED to use https:// instead of http:// when downloading FRED data. getSymbols.csv also got some much-needed love.

I'd also like to mention that development has moved to GitHub for both TTR and quantmod.

This new engine improves the functionality, modularity, and flexibility of

The main objective was to provide functionality similar to

- Basic time series plots with sensible defaults
- Plotting xts objects by column "automagically" as separate panels
- Small multiples with multiple pages
- "Layout-safe" so multiple specifications/panels can be charted in a single device
- Easily add data to an existing plot or add panels, similar to `quantmod::add*`
- Event lines

The new version of

Note that the new

The conference will take place on May 29 and 30, at UIC in Chicago. Building on the success of the previous conferences in 2009-2014, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 30+ presenters covering all areas of finance with R.

We are very excited about the four keynote presentations given by Emanuel Derman, Louis Marascio, Alexander McNeil, and Rishi Narang. The main agenda (currently) includes 18 full presentations and 19 shorter "lightning talks". As in previous years, several (optional) pre-conference seminars are offered on Friday morning.

There is also an (optional) conference dinner that will once again be held at The Terrace at Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.

We would like to thank our 2015 sponsors for the continued support enabling us to host such an exciting conference:

International Center for Futures and Derivatives at UIC

Revolution Analytics

MS-Computational Finance at University of Washington

OneMarketData

Ketchum Trading

RStudio

SYMMYS

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich


Changes to the Yahoo Finance and Oanda websites broke the getOptionChain.yahoo and getSymbols.oanda functions, respectively. I didn’t use getOptionChain.yahoo much, so I’m not certain I restored all the prior functionality. Let me know if there’s something I missed. I’d be glad to add a test case for that, or to add a test you’ve written.

The getSymbols.yahooj function is a major enhancement provided by Wouter Thielen. It allows quantmod users to pull stock data from Yahoo Finance Japan.

Japanese ticker symbols usually start with a number and it is cumbersome to use variable names that start with a number in the R environment, so the string "YJ" will be prepended to each of the Symbols. I recommend using setSymbolLookup to prepend the ticker symbols with “YJ” yourself, so you can just use the main getSymbols function.

For example, if you want to pull Sony data, you would run:

```
require(quantmod)
setSymbolLookup(YJ6758.T = 'yahooj')
getSymbols('YJ6758.T')
```

The full list of supported data sources for quantmod is now: Yahoo Finance-US, Yahoo Finance-Japan, Google Finance, csv, RData (including rds and rda), FRED, SQLite, MySQL, and Oanda.

Contributions to add support for additional data sources are welcomed. The existing getSymbols functions are good templates to start from.



If you’re interested in participating as a student or a mentor, there's an overview of the GSoC program on The R Project GSoC 2015 Wiki. The wiki also includes a timeline and links to prior years' projects.

Several mentors from various backgrounds have already proposed projects for students to work on this summer. Mentors have until March 9th to submit projects they would be willing to support, and student applications begin on March 16th.

There are also several bug fixes. A few worth noting are:


R/Finance 2015: Applied Finance with R

May 29 and 30, 2015

University of Illinois at Chicago

The seventh annual R/Finance conference for applied finance using R will be held on May 29 and 30, 2015 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past six years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2015. This year will include invited keynote presentations by Emanuel Derman, Louis Marascio, Alexander McNeil, and Rishi Narang.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format) although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks." Both academic and practitioner proposals related to R are encouraged.

All slides will be made publicly available at conference time. Presenters are strongly encouraged to provide working R code to accompany the slides. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters, however requests must be made at the time of submission. Assistance will be granted at the discretion of the conference committee.

Please make your submission online at: http://www.cvent.com/d/t4qy73. The submission deadline is January 31, 2015. Submitters will be notified via email by February 28, 2015 of acceptance, presentation length, and financial assistance (if requested).

Additional details will be announced via the conference website as they become available. Information on previous years' presenters and their presentations are also at the conference website.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

The comments below are based on my personal experience. If I don't comment on a seminar or presentation, it doesn't mean I didn't like it or it wasn't good; it may have been over my head or I may have been distracted with my duties as a committee member. All the currently available conference slides are available on the website.

I went to Dirk Eddelbuettel's seminar because I may be writing an R package to query Deltix's TimeBase database. Deltix provides a C++ API, so this is a perfect opportunity to use Rcpp.

The first presentation was given by keynote Luke Tierney, who discussed recent and upcoming performance improvements to R, and introduced some new profiling tools in his proftools package (and a new proftools-GUI package).

Yang Lu explored the low-risk anomaly on high/low volatility portfolios with similar industry, size, and volume. Avery Moon discussed how they use R at Wealthfront to run cashflow simulations for their tax-loss harvesting strategy. Steven Pav used math and memes to discuss portfolio inference. Tobias Setz used the Bayesian Change Point method to analyze time series stability.

Paul Teetor and Matthew Clegg discussed different aspects of pairs trading. Kent Hoxsey demonstrated a simple way to explore trading signal expectation. Matthew Barry introduced the pbo package, which implements some of the ideas in the paper, The Probability of Backtest Overfitting.

Alexios Ghalanos was the day's second keynote, and he discussed smooth transition autoregressive models and his new package, twinkle. Alexios wrote a post discussing his presentation, which you should definitely read.

During the two-hour conference reception at UIC, I had some drinks and hors d'oeuvres, talked with speakers, and met people I encouraged to attend and/or present. Next was the (optional) dinner at The Terrace at Trump. It was cold and windy

The first presentation was a lightning talk by Chirag Anand, where he introduced the eventstudies package, which is very well done. Casey King gave an incredibly informative and entertaining presentation on anti-money laundering and suspicious activity reporting in penny stocks using message board posts. Bryan Lewis introduced his IRL package and ran a 16 million node network analysis in < 2 minutes on his Chromebook, during his talk. Stephen Rush discussed his work on VPIN (volume synchronized probability of informed trading), while competing with Steven Pav for the "presentation with the most memes".

Bob McDonald gave the third keynote presentation, where he discussed using R to teach derivatives in MBA classes. He also explained his decision to adopt R in terms of valuing an option. Eric Zivot discussed his upcoming book, "Modeling Financial Time Series with

Bill Cleveland gave the final keynote and talked about the "divide-and-recombine" method for large, complex data, using R and Hadoop. Gregor Kastner introduced his stochvol package, and Matthew Dixon showed how to calibrate stochastic volatility models using his "alpha" gpusvcalibration package. Dirk Eddelbuettel closed the conference with a lightning talk on his recently-released RcppRedis package.

The committee also presented the awards for best papers. The winners were:

- *Portfolio inference with this one weird trick*, Steven E. Pav
- *Dealing with Stochastic Volatility in Time Series Using the R Package stochvol*, Gregor Kastner
- *Re-Evaluation of the Low-Risk Anomaly in Finance via Matching*, Yang Lu, Daniel Wu, Kwok Yu
- *All words are not equal: Sentiment dynamics and information content within CEO letters*, Kris Boudt, James Thewissen

As always, the conference ended with one more trip to Jaks Tap. I spent some time giving college students some advice about starting their careers, and discussed the presentation I gave earlier in the week at the Chicago R User Group on Profiling for Speed.

Last, but not least: none of this would be possible without the support of fantastic sponsors:

International Center for Futures and Derivatives at UIC, Revolution Analytics, MS-Computational Finance at University of Washington, OneMarketData, RStudio, TIBCO, SYMMYS, and paradigm4.

The examples below specify the portfolio, constraints, and objectives with `portfolio.spec`, `add.constraint`, and `add.objective`.

```
library(PortfolioAnalytics)
data(edhec)
returns <- edhec[, 1:6]
funds <- colnames(returns)
```

Here we create a portfolio object with `portfolio.spec`. The `assets` argument is a required argument to the `portfolio.spec` function. `assets` can be a character vector with the names of the assets, a named numeric vector, or a scalar value specifying the number of assets. If a character vector or scalar value is passed in for `assets`, equal weights will be created for the initial portfolio weights.

```
init.portfolio <- portfolio.spec(assets = funds)
```

The `portfolio` object is an S3 class that contains portfolio level data as well as the constraints and objectives for the optimization problem. You can see that the constraints and objectives lists are currently empty, but we will add sets of constraints and objectives with `add.constraint` and `add.objective`.

```
print.default(init.portfolio)
```

```
## $assets
## Convertible Arbitrage            CTA Global Distressed Securities
##                0.1667                0.1667                0.1667
##      Emerging Markets Equity Market Neutral          Event Driven
##                0.1667                0.1667                0.1667
##
## $category_labels
## NULL
##
## $weight_seq
## NULL
##
## $constraints
## list()
##
## $objectives
## list()
##
## $call
## portfolio.spec(assets = funds)
##
## attr(,"class")
## [1] "portfolio.spec" "portfolio"
```

Here we add the full investment constraint. The full investment constraint is a special case of the leverage constraint that requires the weights to sum to 1, and is specified with the alias `type="full_investment"` as shown below.

```
init.portfolio <- add.constraint(portfolio = init.portfolio, type = "full_investment")
```

Now we add a box constraint to specify a long-only portfolio. The long-only constraint is a special case of a box constraint where the lower bound of each asset's weight is 0 and the upper bound is 1. This is specified with `type="long_only"` as shown below. The box constraint also allows per-asset weight bounds to be specified.

```
init.portfolio <- add.constraint(portfolio = init.portfolio, type = "long_only")
```

The following constraint types are supported:

- leverage
- box
- group
- position_limit¹
- turnover²
- diversification
- return
- factor_exposure
- transaction_cost²

1. Not supported for problems formulated as quadratic programming problems solved with `optimize_method="ROI"`.
2. Not supported for problems formulated as linear programming problems solved with `optimize_method="ROI"`.

The `add.objective` calls below start from `init.portfolio` and add the specified objectives to `minSD.portfolio` and `meanES.portfolio`, while leaving `init.portfolio` unchanged. This is useful for testing multiple portfolios with different objectives using the same constraints, because the constraints only need to be specified once and several new portfolios can be created from one initial portfolio object.

```
# Add objective for portfolio to minimize portfolio standard deviation
minSD.portfolio <- add.objective(portfolio = init.portfolio,
                                 type = "risk",
                                 name = "StdDev")

# Add objectives for portfolio to maximize mean per unit ES
meanES.portfolio <- add.objective(portfolio = init.portfolio,
                                  type = "return",
                                  name = "mean")

meanES.portfolio <- add.objective(portfolio = meanES.portfolio,
                                  type = "risk",
                                  name = "ES")
```

Note that the `name` argument in `add.objective` can be any valid R function. Several functions are provided in the PerformanceAnalytics package that can be specified as the `name` argument, such as ES/ETL/CVaR, StdDev, etc. The following objective types are supported:

- return
- risk
- risk_budget
- weight_concentration
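As a sketch of that flexibility, you could write your own risk function and pass its name to `add.objective()`. This is a hypothetical example; `downsideDev` is not part of PortfolioAnalytics, and the exact arguments the optimizer passes depend on how the objective is configured:

```
# Toy downside-deviation measure: root mean square of the negative returns
# (illustrative only; the optimizer would call it on the weighted return series)
downsideDev <- function(R) {
  sqrt(mean(pmin(R, 0)^2))
}

# Hypothetical usage (assumes init.portfolio from above):
# dd.portfolio <- add.objective(portfolio = init.portfolio,
#                               type = "risk", name = "downsideDev")
```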

The `add.constraint` and `add.objective` functions were designed to be very flexible and modular so that constraints and objectives can easily be specified and added to `portfolio` objects. PortfolioAnalytics provides a `print` method so that we can easily view the assets, constraints, and objectives that we have specified for the portfolio.

```
print(minSD.portfolio)
```

```
## **************************************************
## PortfolioAnalytics Portfolio Specification
## **************************************************
##
## Call:
## portfolio.spec(assets = funds)
##
## Assets
## Number of assets: 6
##
## Asset Names
## [1] "Convertible Arbitrage" "CTA Global"            "Distressed Securities"
## [4] "Emerging Markets"      "Equity Market Neutral" "Event Driven"
##
## Constraints
## Number of constraints: 2
## Number of enabled constraints: 2
## Enabled constraint types
##      - full_investment
##      - long_only
## Number of disabled constraints: 0
##
## Objectives
## Number of objectives: 1
## Number of enabled objectives: 1
## Enabled objective names
##      - StdDev
## Number of disabled objectives: 0
```

```
print(meanES.portfolio)
```

```
## **************************************************
## PortfolioAnalytics Portfolio Specification
## **************************************************
##
## Call:
## portfolio.spec(assets = funds)
##
## Assets
## Number of assets: 6
##
## Asset Names
## [1] "Convertible Arbitrage" "CTA Global"            "Distressed Securities"
## [4] "Emerging Markets"      "Equity Market Neutral" "Event Driven"
##
## Constraints
## Number of constraints: 2
## Number of enabled constraints: 2
## Enabled constraint types
##      - full_investment
##      - long_only
## Number of disabled constraints: 0
##
## Objectives
## Number of objectives: 2
## Number of enabled objectives: 2
## Enabled objective names
##      - mean
##      - ES
## Number of disabled objectives: 0
```

Now that we have portfolios set up with the desired constraints and objectives, we use `optimize.portfolio` to run the optimizations. The examples below use `optimize_method="ROI"`, but several other solvers are supported, including the following:

- DEoptim (differential evolution)
- random portfolios
  - sample
  - simplex
  - grid
- GenSA (generalized simulated annealing)
- pso (particle swarm optimization)
- ROI (R Optimization Infrastructure)
  - Rglpk
  - quadprog

The minimum standard deviation objective can be formulated as a quadratic programming problem and solved quickly with `optimize_method="ROI"`.

```
# Run the optimization for the minimum standard deviation portfolio
minSD.opt <- optimize.portfolio(R = returns, portfolio = minSD.portfolio,
                                optimize_method = "ROI", trace = TRUE)
print(minSD.opt)
## ***********************************
## PortfolioAnalytics Optimization
## ***********************************
##
## Call:
## optimize.portfolio(R = returns, portfolio = minSD.portfolio,
##     optimize_method = "ROI", trace = TRUE)
##
## Optimal Weights:
## Convertible Arbitrage            CTA Global Distressed Securities
##                0.0000                0.0652                0.0000
##      Emerging Markets Equity Market Neutral          Event Driven
##                0.0000                0.9348                0.0000
##
## Objective Measure:
##   StdDev
## 0.008855
```
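For intuition about the objective itself: the optimizer is minimizing the portfolio standard deviation, sqrt(w'Σw). A base-R sketch of that calculation is below; the returns matrix is simulated stand-in data (not the EDHEC series used above), and only the weight vector comes from the printed output.

```r
# Base-R sketch of the quantity the optimizer minimizes: portfolio standard
# deviation sqrt(w' Sigma w). The returns are simulated stand-ins; the
# weights are the optimal ones printed above.
set.seed(42)
R <- matrix(rnorm(6 * 100, sd = 0.01), ncol = 6)   # 100 periods, 6 assets
w <- c(0, 0.0652, 0, 0, 0.9348, 0)                 # optimal weights from above
stopifnot(abs(sum(w) - 1) < 1e-8)                  # full_investment holds
port.sd <- sqrt(drop(t(w) %*% cov(R) %*% w))       # scalar portfolio std dev
```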

The objective to maximize mean return per unit ES can be formulated as a linear programming problem and solved quickly with `optimize_method="ROI"`.

```
# Run the optimization to maximize mean return per unit ES
meanES.opt <- optimize.portfolio(R = returns, portfolio = meanES.portfolio,
                                 optimize_method = "ROI", trace = TRUE)
print(meanES.opt)
## ***********************************
## PortfolioAnalytics Optimization
## ***********************************
##
## Call:
## optimize.portfolio(R = returns, portfolio = meanES.portfolio,
##     optimize_method = "ROI", trace = TRUE)
##
## Optimal Weights:
## Convertible Arbitrage            CTA Global Distressed Securities
##                0.0000                0.2940                0.2509
##      Emerging Markets Equity Market Neutral          Event Driven
##                0.0000                0.4552                0.0000
##
## Objective Measure:
##     mean
## 0.006635
##
##      ES
## 0.01837
```
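The risk measure in this objective is expected shortfall. A hedged base-R sketch of the simple historical estimator follows; note that the `ES` function PortfolioAnalytics actually uses (from PerformanceAnalytics) offers several estimators and sign conventions, so `hist.es` here is only an illustrative stand-in.

```r
# Base-R sketch of historical expected shortfall (ES, a.k.a. CVaR): the
# average loss beyond the (1 - p) quantile, reported as a positive number.
# Illustrative only -- PerformanceAnalytics::ES has more options.
hist.es <- function(r, p = 0.95) {
  q <- quantile(r, probs = 1 - p, names = FALSE)  # the (1 - p) return quantile
  -mean(r[r <= q])                                # mean loss in the tail
}
set.seed(1)
r <- rnorm(1000, mean = 0.005, sd = 0.02)  # simulated monthly-style returns
hist.es(r)  # average loss in the worst 5% of periods
```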

The PortfolioAnalytics package provides charting functions to better understand the optimization problem through visualization. The `plot` function produces a plot of the optimal weights and of the optimal portfolio in risk-return space. The weights and the risk-return chart can also be plotted separately with `chart.Weights` and `chart.RiskReward`.

```
plot(minSD.opt, risk.col = "StdDev", chart.assets = TRUE,
     main = "Min SD Optimization",
     ylim = c(0, 0.0083), xlim = c(0, 0.06))

plot(meanES.opt, chart.assets = TRUE,
     main = "Mean ES Optimization",
     ylim = c(0, 0.0083), xlim = c(0, 0.16))
```

This post demonstrates how to construct a portfolio object, add constraints, and add objectives for two simple optimization problems: one to minimize portfolio standard deviation and another to maximize mean return per unit expected shortfall. We then run optimizations on both portfolio objects and plot the results. Although this post demonstrates fairly simple constraints and objectives, PortfolioAnalytics supports complex constraints and objectives as well as many other features that will be covered in subsequent posts. The PortfolioAnalytics package is part of the ReturnAnalytics project on R-Forge. For additional examples and information, refer to the several vignettes and demos provided in the package.
]]>
Building on the success of the previous conferences in 2009-2013, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 30+ presenters covering all areas of finance with R.

We are very excited about the four keynote presentations given by Bob McDonald, Bill Cleveland, Alexios Ghalanos, and Luke Tierney. The main agenda (currently) includes 16 full presentations and 21 shorter "lightning talks". We are also excited to offer four optional pre-conference seminars on Friday morning.

The (optional) conference dinner will once again be held at The Terrace at Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.

More details of the agenda are available at:

http://www.RinFinance.com/agenda/

Registration information is available at:

http://www.RinFinance.com/register/

and can also be directly accessed by going to:

http://www.regonline.com/RFinance2014

We would like to thank our 2014 Sponsors for the continued support enabling us to host such an exciting conference:

International Center for Futures and Derivatives at UIC

Revolution Analytics

MS-Computational Finance at University of Washington

OneMarketData

RStudio

TIBCO

SYMMS

paradigm4

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

]]>

If your strategy is not path-dependent, you can get a fairly substantial performance improvement by turning path-dependence off. If your strategy truly is path-dependent, keep reading...

I started working with Brian Peterson in late August of this year, and we've been working on a series of very large backtests over the last several weeks. Each backtest consisted of ~7 months of 5-second data on 72 instruments with 15-20 different configurations for each.

These backtests really pushed quantstrat to its limits. The longest-running job took 32 hours. I had some time while they were running, so I decided to profile quantstrat. I was able to make some substantial improvements, so I thought I'd write a post to tell you what's changed and highlight some of the performance gains we're seeing.

The biggest issue was how the internal function constructed calls to the rule functions: the market data was deparsed into the call itself, so the call looked like this:

```
ruleFunction(c(50.04, 50.23, 50.42, 50.37, 50.24, 50.13, 50.12, 50.42, 50.42, 50.37, 50.24, 50.22, 49.95, 50.23, 50.26, 50.22, 50.11, 49.99, 50.12, 50.4, 50.33, 50.33, 50.18, 49.99), ...)
```

instead of this:

```
ruleFunction(mktdata, ...)
```

You can only imagine how large that first call would be for a 10-million-row **mktdata** object.
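To see why this matters, the sketch below compares the two call styles (`ruleFunction` is a stand-in name, as above): deparsing the data into the call grows linearly with the data, while passing the object by name stays one line no matter how large `mktdata` is.

```r
# Hypothetical illustration: a call with the data deparsed into it grows
# with the data, while a call that passes the object by name does not.
mktdata <- round(runif(1e4, 50, 51), 2)
big.call   <- deparse(bquote(ruleFunction(.(mktdata), ...)))  # data baked in
small.call <- deparse(quote(ruleFunction(mktdata, ...)))      # object by name
length(big.call)    # thousands of lines of deparsed numbers
length(small.call)  # 1
```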

If you think I would be happy enough with that, you don't know me. Several other changes helped get a 2-hour run down to under 30 minutes.

- We now calculate **periodicity(mktdata)** in **applyRules** and pass that value to **ruleOrderProc**. This avoids re-calculating that value for every order, since **mktdata** doesn't change inside **applyRules**.

- We also pass the current row index value to **ruleOrderProc** and **ruleSignal**, because xts subsetting via an integer is much faster than via a POSIXct object.

- **applyStrategy** only accumulates values returned from **applyIndicators**, **applySignals**, and **applyRules** if **debug=TRUE**. This saves a little time, but can save a lot of memory for large **mktdata** objects.

- The dimension reduction algorithm has to look for the first time the price crosses the limit order price. We were doing that with a call to **which(sigThreshold(...))[1]**. The relational operators (**<**, **<=**, **>**, **>=**, and **==**) and **which** operate on the entire vector, but we only need the first value, so I replaced that code with a C-based **.firstThreshold** function that stops as soon as it finds the first cross.

All these changes are most significant for large data sets. The small demo strategies included with quantstrat are also faster, but the net performance gains increase as the size of the data, the number of signals (and therefore the number of rule evaluations), and number of instruments increases.

You're still reading? What are you waiting for? Go install the latest from R-Forge and try it for yourself!

]]>

R/Finance 2014: Applied Finance with R

May 16 and 17, 2014

University of Illinois at Chicago

The sixth annual R/Finance conference for applied finance using R will be held on May 16 and 17, 2014 in Chicago, IL, USA at the University of Illinois at Chicago. The conference will cover topics including portfolio management, time series analysis, advanced risk tools, high-performance computing, market microstructure, and econometrics. All will be discussed within the context of using R as a primary tool for financial risk management, portfolio construction, and trading.

Over the past five years, R/Finance has included attendees from around the world. It has featured presentations from prominent academics and practitioners, and we anticipate another exciting line-up for 2014.

We invite you to submit complete papers in pdf format for consideration. We will also consider one-page abstracts (in txt or pdf format), although more complete papers are preferred. We welcome submissions for both full talks and abbreviated "lightning talks". Both academic and practitioner proposals related to R are encouraged.

Presenters are strongly encouraged to provide working R code to accompany the presentation/paper. Data sets should also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). Preference may be given to presenters who have released R packages.

The conference will award two (or more) $1000 prizes for best papers. A submission must be a full paper to be eligible for a best paper award. Extended abstracts, even if a full paper is provided by conference time, are not eligible for a best paper award. Financial assistance for travel and accommodation may be available to presenters at the discretion of the conference committee. Requests for assistance should be made at the time of submission.

Please submit your papers or abstracts online at: goo.gl/OmKnu7. The submission deadline is January 31, 2014. Submitters will be notified via email by February 28, 2014 of acceptance, presentation length, and decisions on requested funding.

Additional details will be announced via the conference website www.RinFinance.com as they become available. Information on previous years' presenters and their presentations are also at the conference website.

For the program committee:

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson,

Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich ]]>

The comments below are based on my personal experience. If I don't comment on a seminar or presentation, it doesn't mean I didn't like it or it wasn't good; it may have been over my head or I may have been distracted with my duties as a committee member. All the currently available conference slides are available on the website.

I went to (and live-tweeted) Jeff Ryan's seminar because I wanted to learn more about how he uses mmap+indexing with options data. There I realized that POSIXlt components use a zero-based index because they mirror the underlying tm struct, and that mmap+indexing files can be shared across cores and you can read them from other languages (e.g. Python).

The first presentation was by keynote Ryan Sheftel, who talked about how he uses R on his bond trading desk. David Ardia showed how expected returns can be estimated via the covariance matrix. Ronald Hochreiter gave an overview of modeling optimization via his modopt package. Tammer Kamel gave a live demo of the Quandl package and said, "Quandl hopes to do to Bloomberg what Wikipedia did to Britannica."

I had the pleasure of introducing both Doug Martin, who talked about robust covariance estimation, and Giles Heywood, who discussed several ways of estimating and forecasting covariance, and proposed an "open source equity risk and backtest system" as a means of matching talent with capital.

Ruey Tsay was the next keynote, and spoke about using principal volatility components to simplify multivariate volatility modeling. Alexios Ghalanos spoke about modeling multivariate time-varying skewness and kurtosis. Unfortunately, I missed both Kris Boudt's and David Matteson's presentations, but I did get to see Winston Chang's live demo of Shiny.

The two-hour conference reception at UIC was a great time to have a drink, talk with speakers, and say hello to people I had never met in person. Next was the (optional) dinner at The Terrace at Trump. Unfortunately, it was cold and windy, so we only spent 15-20 minutes on the terrace before moving inside. The food was fantastic, but the conversations were even better.

I missed the first block of lightning talks. Samantha Azzarello discussed her work with Blu Putnam, which used a dynamic linear model to evaluate the Fed's performance vis-a-vis the Taylor Rule. Jiahan Li used constrained least squares on 4 economic fundamentals to forecast foreign exchange rates. Thomas Harte talked about regulatory requirements of foreign exchange pricing (and wins the award for most slides, 270); basically documentation is important, Sweave to the rescue!

Sanjiv Das gave a keynote on 4 applications: 1) network analysis on SEC and FDIC filings to determine banks that pose systemic risk, 2) determining which home mortgage modification is optimal, 3) portfolio optimization with mental accounting, 4) venture capital communities.

I had the pleasure of introducing the following speakers: Dirk Eddelbuettel showed how easy it is to write fast linear algebra code with RcppArmadillo. Klaus Spanderen showed how to use QuantLib from R, and even how to call C++ from R from C++. Bryan Lewis talked about SciDB and the scidb package (SciDB contains fast linear algebra routines that operate **on** the database!). Matthew Dowle gave an introduction to data.table (in addition to a seminar).

Attilio Meucci gave his keynote on visualizing advanced risk management and portfolio optimization. Immediately following, Brian Peterson gave a lightning talk on implementing Meucci's work in R (Attilio works in Matlab), which was part of a Google Summer of Code project last year.

Thomas Hanson presented his work with Don Chance (and others) on computational issues in estimating the volatility smile. Jeffrey Ryan showed how to manipulate options data in R with the greeks package.

The conference wrapped up by giving away three books, generously donated by Springer, to three random people who submitted feedback surveys. I performed the random drawing live on stage, using my patent-pending TUMC method (I tossed the papers up in the air).

The committee also presented the awards for best papers. The winners were:

- *Regime switches in volatility and correlation of financial institutions*, Boudt et al.
- *A Bayesian interpretation of the Federal Reserve's dual mandate and the Taylor Rule*, Putnam & Azzarello
- *Nonparametric Estimation of Stationarity and Change Points in Finance*, Matteson et al.
- *Estimating High Dimensional Covariance Matrix Using a Factor Model*, Sun (best student paper)

Last, but not least: none of this would be possible without the support of fantastic sponsors: International Center for Futures and Derivatives at UIC, Revolution Analytics, MS-Computational Finance at University of Washington, Google, lemnica, OpenGamma, OneMarketData, and RStudio.

Building on the success of the previous conferences in 2009, 2010, 2011 and 2012, we expect more than 250 attendees from around the world. R users from industry, academia, and government will be joining 30+ presenters covering all areas of finance with R.

We are very excited about the four keynotes by Sanjiv Das, Attilio Meucci, Ryan Sheftel, and Ruey Tsay. The main agenda (currently) includes seventeen full presentations and fifteen shorter "lightning talks". We are also excited to offer five optional pre-conference seminars on Friday morning.

To celebrate the fifth year of the conference in style, the dinner will be held at The Terrace at Trump Hotel. Overlooking the Chicago river and skyline, it is a perfect venue to continue conversations while dining and drinking.

More details of the agenda are available at:

http://www.RinFinance.com/agenda/

Registration information is available at:

http://www.RinFinance.com/register/

and can also be directly accessed by going to:

http://www.regonline.com/RFinance2013

We would like to thank our 2013 Sponsors for the continued support enabling us to host such an exciting conference:

International Center for Futures and Derivatives at UIC

Revolution Analytics

MS-Computational Finance at University of Washington

lemnica

OpenGamma

OneMarketData

RStudio

On behalf of the committee and sponsors, we look forward to seeing you in Chicago!

Gib Bassett, Peter Carl, Dirk Eddelbuettel, Brian Peterson, Dale Rosenthal, Jeffrey Ryan, Joshua Ulrich

]]>

There are some exciting new features, including a rolling single-factor model function (rollSFM, based on a prototype from James Toll), a runPercentRank function from Charlie Friedemann, stoch and WPR return 0.5 instead of NaN when there's insufficient price movement, and a faster aroon function.
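`rollSFM` fits a single-factor model over a rolling window. As a one-window sketch of the quantities it returns (alpha, beta, and R-squared), here is the equivalent `lm` fit on simulated data; the variable names are illustrative, though `rollSFM` itself does take `Ra`, `Rb`, and a window length `n`.

```r
# One-window sketch of what rollSFM computes on each rolling window:
# the single-factor model  Ra = alpha + beta * Rb + e.  Simulated data.
set.seed(7)
Rb <- rnorm(60, mean = 0.004, sd = 0.04)        # factor (e.g. market) returns
Ra <- 0.001 + 1.2 * Rb + rnorm(60, sd = 0.01)   # asset returns
fit <- lm(Ra ~ Rb)
alpha     <- unname(coef(fit)[1])
beta      <- unname(coef(fit)[2])
r.squared <- summary(fit)$r.squared
```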

Here are all of the updates (from the CHANGES file):

#-#-#-#-#-#-#-#-#-# Changes in TTR version 0.22-0 #-#-#-#-#-#-#-#-#-#

SIGNIFICANT USER-VISIBLE CHANGES

- CCI now returns an object with colnames ("cci").
- All moving average functions now attempt to set colnames.
- Added clarification on the displaced nature of DPO.
- SAR now sets the initial gap based on the standard deviation of the high-low range instead of hard-coding it at 0.01.

NEW FEATURES

- Added rollSFM function that calculates alpha, beta, and R-squared for a single-factor model, thanks to James Toll for the prototype.
- Added runPercentRank function, thanks to Charlie Friedemann.
- Moved slowest portion of aroon to C.
- DonchianChannel gains an 'include.lag=FALSE' argument, which includes the current period's data in the calculation. Setting it to TRUE replicates the original calculation. Thanks to Garrett See and John Bollinger.
- The Stochastic Oscillator and Williams' %R now return 0.5 (instead of NaN) when a security's price doesn't change over a sufficient period.
- All moving average functions gain '...'.
- Users can now change alpha in Yang Zhang volatility calculation.

BUG FIXES

- Fixed MACD when maType is a list. Now mavg.slow=maType[[2]] and mavg.fast=maType[[1]], as users expected based on the order of the nFast and nSlow arguments. Thanks to Phani Nukala and Jonathan Roy.
- Fixed bug in lags function, thanks to Michael Weylandt.
- Corrected error in Yang Zhang volatility calculation, thanks to several people for identifying this error.
- Correction to SAR extreme point calculations, thanks to Vamsi Galigutta.
- adjRatios now ensures all inputs are univariate, thanks to Garrett See.
- EMA and EVWMA now ensure n < number of non-NA values, thanks to Roger Bos.
- Fix to BBands docs, thanks to Evelyn Mitchell.
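To illustrate what the new `runPercentRank` measures, here is a plain-R sketch of a running percent rank: the fraction of the trailing window below the current value. This is only a stand-in; TTR's `runPercentRank` has additional arguments (e.g. cumulative mode and tie handling) that it ignores.

```r
# Plain-R sketch of a running percent rank over an n-period window: the
# fraction of the trailing window strictly below the current value.
run_pct_rank <- function(x, n) {
  sapply(seq_along(x), function(i) {
    if (i < n) return(NA_real_)        # not enough history yet
    mean(x[(i - n + 1):i] < x[i])      # share of window below current value
  })
}
round(run_pct_rank(c(1, 3, 2, 5, 4), 3), 2)  # NA NA 0.33 0.67 0.33
```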

Join me in getting a good refresher on basic statistics, simulation and bootstrapping, linear algebra, and learning more about portfolio optimization, efficient portfolios, and risk budgeting. ]]>

Please try xtsExtra::plot.xts and let us know what you think. A sample of the eye-candy produced by the code in Michael's email is below. Granted, this isn't a one-liner, but it's certainly impressive! Great work Michael!

]]>

The book describes 6 approaches to distributed computing. Thoughts on each approach follow:

1) snow

The chapter starts by showing you how to create a socket cluster on a single machine (later sections discuss MPI clusters, and socket clusters of several machines). Then a section describes how to initialize workers, with a later section giving a slightly advanced discussion of how functions are serialized to workers.

There's a great demonstration (including graphs) of why/when you should use clusterApplyLB instead of clusterApply. There's also a fantastic discussion of potential I/O issues (probably one of the most surprising/confusing issues for people new to distributed computing) and how parApply handles them. Then the authors provide a very useful parApplyLB function.

There are a few (but very important!) paragraphs on random number generation using the rsprng and rlecuyer packages.

2) multicore

The chapter starts by noting that the multicore package only works on a single computer running a POSIX-compliant operating system (i.e. most anything *except* Windows).

The next section describes the mclapply function, and also explains how mclapply creates a cluster each time it's called, why this isn't a speed issue, and how it is actually beneficial. The next few sections describe some of the optional mclapply arguments, and how you can achieve load balancing with mclapply. A good discussion of the pvec, parallel, and collect functions follows.

There are some great tips on how to use the rsprng and rlecuyer packages for random number generation, even though they aren't directly supported by the multicore package. The chapter concludes with a short, but effective, description of multicore's low-level API.
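As a minimal taste of the fork-based API discussed above, the sketch below uses the parallel package (which ships with R >= 2.14.0 and absorbed multicore's interface) rather than multicore itself. With `mc.cores = 1` (the forced value on Windows) it degrades to an ordinary `lapply`.

```r
# Minimal mclapply sketch, shown with the parallel package rather than the
# original multicore package. mc.cores = 1 keeps it portable (no forking).
library(parallel)
squares <- mclapply(1:4, function(i) i^2, mc.cores = 1)
unlist(squares)  # 1 4 9 16
```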

3) parallel (comes with R >= 2.14.0)

The chapter starts by noting that the parallel package is a combination of the snow and multicore packages. This chapter is relatively short, since those two packages were covered in detail over the prior two chapters. Most of the content discusses the implementation differences between parallel and snow/multicore.

4) R+Hadoop

There's a full chapter primer on Hadoop and MapReduce, for those who aren't familiar with the software and the concept. The chapter ends with an introduction to Amazon's EC2 and EMR services, which significantly lower the barrier to using Hadoop.

The chapter on R+Hadoop is very little R and mostly Hadoop. This is because Hadoop requires more setup than the other approaches. You will need to do some work on the command line and with environment variables.

There are three examples: one using Hadoop streaming and two using the Java API (which require writing/modifying some Java code). The authors take care to describe each block of code in all the examples, so it's accessible to those who haven't written Java.

5) RHIPE

Using three examples, this chapter provides a thorough treatment of how to use RHIPE to abstract away a lot of the boilerplate code required for Hadoop. Everything is done in R. As with the Hadoop chapter, the authors describe each block of code.

RHIPE does require a little setup: it must be installed on your workstation and all cluster nodes. In the examples, the authors describe how RHIPE allows you to transfer R objects between Map and Reduce phases, and they mention the RHIPE functions you can use to manipulate HDFS data.

6) segue

This is a very short chapter because the segue package has very narrow scope: using Amazon's EMR service in two lines of code!

Final thoughts:

I would recommend this book to someone who is looking to move beyond the most basic distributed computing solutions. The authors are careful to point you in the right direction and warn you of potential pitfalls of each approach.

All but the most basic setups (e.g. a socket cluster on a single machine) will require some familiarity with the command line, environment variables, and networking. This isn't the fault of the authors or any of the approaches... parallel computing just isn't that easy.

I really expected to see something on using foreach, especially since Stephen Weston has done work on those packages. It is mentioned briefly at the end of the book, so maybe it will appear in later editions.

]]>