Abstract
The Minimum Description Length (MDL) principle is the modern formalisation of Occam's razor. It has been extensively and successfully used in machine learning (ML), especially for noisy and long sources of data. However, the MDL principle presents some paradoxes and inconveniences. After discussing all of these, we address two of the most relevant: lack of explanation and lack of creativity. We present new alternatives to address these problems. The first one, intensional complexity, avoids extensional parts in a description, thus distributing the compression ratio more evenly than the MDL principle. The second one, information gain, forces the hypothesis to be informative (or computationally hard to discover) with respect to the evidence, thus giving a formal definition of what it is to discover.
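For context, the MDL principle the abstract refers to is standardly stated as a two-part code-length minimisation (this is the textbook formulation, e.g. as in Rissanen's work, not a formula taken from this paper):

```latex
% Two-part MDL: choose the hypothesis H from a hypothesis class \mathcal{H}
% that minimises the length of H itself plus the length of the data D
% when encoded with the help of H.
H_{\mathrm{MDL}} \;=\; \operatorname*{arg\,min}_{H \in \mathcal{H}} \bigl[\, L(H) + L(D \mid H) \,\bigr]
```

The paper's two alternatives modify this trade-off: intensional complexity constrains how the total code length may be distributed between the two terms, and information gain additionally requires that $H$ not be trivially recoverable from $D$.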
Hernández-Orallo, J., García-Varea, I. Explanatory and Creative Alternatives to the MDL Principle. Foundations of Science 5, 185–207 (2000). https://doi.org/10.1023/A:1011350914776