Welcome!

About the Book

This book is a work in progress!

This book is a collection of data wrangling problems and solutions tailored for business school students. It is designed as a cookbook for readers who have a particular data wrangling problem at hand and want to get the job done quickly. I have also tried my best to explain each line of code, so that beyond copy-pasting, you can still learn useful techniques :)

Currently, this book has three chapters in Part I. All of them focus on R (I’ll cover Python in the future):

  • An Introduction to data.table (TBD). A gentle introduction to data.table. I’ll show you why data.table is more efficient for data wrangling problems than its competitor, dplyr.

  • A Highly Efficient Event Study Code. Event studies are almost everywhere in business research. I offer a highly efficient event study implementation in R. The core part has only 30+ lines, and it’s 5x to 10x faster than the Python version offered by WRDS.

  • 40 Practices on Stock Data Processing (Recommended). A collection of 40+ problems that you’ll frequently encounter when dealing with stock data. You can use this chapter either as a cookbook or as a learning-by-doing textbook for data.table. The chapter is an English version of this GitHub repo, which is authored by me and Rui Li (Zhejiang Financial College). Special thanks to renkun-ken, who provided the original problem set.

  • Useful Functions. A collection of useful and efficient functions. Most of them are faster rewrites of functions offered in other packages. For example, I offer a function, drawdown, that computes the largest drawdowns of an asset and is 1.89x faster than table.Drawdowns from the PerformanceAnalytics package.

If you have any questions, please contact me at: ross dot zhu at outlook dot com.