Advances in Financial Machine Learning
1–393 (2018)
Abstract
"Machine learning (ML) is changing virtually every aspect of our lives. Today ML algorithms accomplish tasks that until recently only expert humans could perform. As it relates to finance, this is the most exciting time to adopt a disruptive technology that will transform how everyone invests for generations. Readers will learn how to structure big data in a way that is amenable to ML algorithms; how to conduct research with ML algorithms on that data; how to use supercomputing methods; and how to backtest their discoveries while avoiding false positives. The book addresses real-life problems faced by practitioners on a daily basis, and explains scientifically sound solutions using math, supported by code and examples. Readers become active users who can test the proposed solutions in their particular setting. Written by a recognized expert and portfolio manager, this book will equip investment professionals with the groundbreaking tools needed to succeed in modern finance."
"This book begins by structuring financial data in a way that is amenable to machine learning (ML) algorithms. Then, the author discusses how to conduct research with ML algorithms on that data and how to backtest your discoveries. Most of the problems and solutions are explained using math, supported by code. This makes the book very practical and hands-on. Readers become active users who can test the solutions proposed in their work. Readers will learn how to structure, label, weight, and backtest data.
Machine learning is the future, and this book will equip investment professionals with the tools to utilize it moving forward."

Intro; Advances in Financial Machine Learning; Contents; About the Author; Preamble
1 Financial Machine Learning as a Distinct Subject; 1.1 Motivation; 1.2 The Main Reason Financial Machine Learning Projects Usually Fail; 1.2.1 The Sisyphus Paradigm; 1.2.2 The Meta-Strategy Paradigm; 1.3 Book Structure; 1.3.1 Structure by Production Chain; 1.3.2 Structure by Strategy Component; 1.3.3 Structure by Common Pitfall; 1.4 Target Audience; 1.5 Requisites; 1.6 FAQs; 1.7 Acknowledgments; Exercises; References; Bibliography
PART 1 Data Analysis
2 Financial Data Structures; 2.1 Motivation; 2.2 Essential Types of Financial Data; 2.2.1 Fundamental Data; 2.2.2 Market Data; 2.2.3 Analytics; 2.2.4 Alternative Data; 2.3 Bars; 2.3.1 Standard Bars; 2.3.2 Information-Driven Bars; 2.4 Dealing with Multi-Product Series; 2.4.1 The ETF Trick; 2.4.2 PCA Weights; 2.4.3 Single Future Roll; 2.5 Sampling Features; 2.5.1 Sampling for Reduction; 2.5.2 Event-Based Sampling; Exercises; References
3 Labeling; 3.1 Motivation; 3.2 The Fixed-Time Horizon Method; 3.3 Computing Dynamic Thresholds; 3.4 The Triple-Barrier Method; 3.5 Learning Side and Size; 3.6 Meta-Labeling; 3.7 How to Use Meta-Labeling; 3.8 The Quantamental Way; 3.9 Dropping Unnecessary Labels; Exercises; Bibliography
4 Sample Weights; 4.1 Motivation; 4.2 Overlapping Outcomes; 4.3 Number of Concurrent Labels; 4.4 Average Uniqueness of a Label; 4.5 Bagging Classifiers and Uniqueness; 4.5.1 Sequential Bootstrap; 4.5.2 Implementation of Sequential Bootstrap; 4.5.3 A Numerical Example; 4.5.4 Monte Carlo Experiments; 4.6 Return Attribution; 4.7 Time Decay; 4.8 Class Weights; Exercises; References; Bibliography
5 Fractionally Differentiated Features; 5.1 Motivation; 5.2 The Stationarity vs. Memory Dilemma; 5.3 Literature Review; 5.4 The Method; 5.4.1 Long Memory; 5.4.2 Iterative Estimation; 5.4.3 Convergence; 5.5 Implementation; 5.5.1 Expanding Window; 5.5.2 Fixed-Width Window Fracdiff; 5.6 Stationarity with Maximum Memory Preservation; 5.7 Conclusion; Exercises; References; Bibliography
PART 2 Modelling
6 Ensemble Methods; 6.1 Motivation; 6.2 The Three Sources of Errors; 6.3 Bootstrap Aggregation; 6.3.1 Variance Reduction; 6.3.2 Improved Accuracy; 6.3.3 Observation Redundancy; 6.4 Random Forest; 6.5 Boosting; 6.6 Bagging vs. Boosting in Finance; 6.7 Bagging for Scalability; Exercises; References; Bibliography
7 Cross-Validation in Finance; 7.1 Motivation; 7.2 The Goal of Cross-Validation; 7.3 Why K-Fold CV Fails in Finance; 7.4 A Solution: Purged K-Fold CV; 7.4.1 Purging the Training Set; 7.4.2 Embargo; 7.4.3 The Purged K-Fold Class; 7.5 Bugs in Sklearn's Cross-Validation; Exercises; Bibliography
8 Feature Importance; 8.1 Motivation; 8.2 The Importance of Feature Importance; 8.3 Feature Importance with Substitution Effects; 8.3.1 Mean Decrease Impurity; 8.3.2 Mean Decrease Accuracy; 8.4 Feature Importance without Substitution Effects; 8.4.1 Single Feature Importance; 8.4.2 Orthogonal Features.