Arisoy, Sainath, Kingsbury, et al. 2012.
“Deep Neural Network Language Models.” In
Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-Gram Model? On the Future of Language Modeling for HLT. WLM ’12.
Autebert, Berstel, and Boasson. 1997.
“Context-Free Languages and Pushdown Automata.” In
Handbook of Formal Languages, Vol. 1.
Baeza-Yates, and Ribeiro-Neto. 1999. Modern Information Retrieval.
Bender, Gebru, McMillan-Major, et al. 2021.
“On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜.” In
Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency.
Bengio, Ducharme, Vincent, et al. 2003.
“A Neural Probabilistic Language Model.” Journal of Machine Learning Research.
Berstel, and Boasson. 1990.
“Transductions and Context-Free Languages.” In
Handbook of Theoretical Computer Science, Vol. A: Algorithms and Complexity.
Blazek, and Lin. 2020.
“A Neural Network Model of Perception and Reasoning.” arXiv:2002.11319 [Cs, q-Bio].
Bolhuis, Tattersall, Chomsky, et al. 2014.
“How Could Language Have Evolved?” PLoS Biology.
Booth, and Thompson. 1973.
“Applying Probability Measures to Abstract Languages.” IEEE Transactions on Computers.
Brown, Mann, Ryder, et al. 2020.
“Language Models Are Few-Shot Learners.” arXiv:2005.14165 [Cs].
Casacuberta, and de la Higuera. 2000.
“Computational Complexity of Problems on Probabilistic Grammars and Transducers.” In
Grammatical Inference: Algorithms and Applications.
Charniak. 1996. Statistical Language Learning.
Clark, Alexander, and Eyraud. 2005.
“Identification in the Limit of Substitutable Context-Free Languages.” In
Algorithmic Learning Theory. Lecture Notes in Computer Science.
Clark, Alexander, Florêncio, and Watkins. 2006.
“Languages as Hyperplanes: Grammatical Inference with String Kernels.” In
Machine Learning: ECML 2006. Lecture Notes in Computer Science 4212.
Clark, Alexander, Florêncio, Watkins, et al. 2006.
“Planar Languages and Learnability.” In
Grammatical Inference: Algorithms and Applications. Lecture Notes in Computer Science 4201.
Collins, and Duffy. 2002.
“Convolution Kernels for Natural Language.” In
Advances in Neural Information Processing Systems 14.
Gold. 1967.
“Language Identification in the Limit.” Information and Control.
Gonzalez, and Thomason. 1978. Syntactic Pattern Recognition: An Introduction.
Grefenstette, Hermann, Suleyman, et al. 2015.
“Learning to Transduce with Unbounded Memory.” arXiv:1506.02516 [Cs].
Hopcroft, and Ullman. 1979. Introduction to Automata Theory, Languages and Computation.
Khalifa, Barros, and Togelius. 2019.
“DeepTingle.”
Kleinberg, and Mullainathan. 2024.
“Language Generation in the Limit.”
Kontorovich, Leonid, Cortes, and Mohri. 2006.
“Learning Linearly Separable Languages.” In
Algorithmic Learning Theory. Lecture Notes in Computer Science 4264.
Kontorovich, Leonid (Aryeh), Cortes, and Mohri. 2008.
“Kernel Methods for Learning Languages.” Theoretical Computer Science, Algorithmic Learning Theory.
Lafferty, McCallum, and Pereira. 2001.
“Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data.” In
Proceedings of the Eighteenth International Conference on Machine Learning. ICML ’01.
Manning. 2002.
“Probabilistic Syntax.” In
Probabilistic Linguistics.
Manning, Raghavan, and Schütze. 2008. Introduction to Information Retrieval.
Manning, and Schütze. 1999. Foundations of Statistical Natural Language Processing.
Mikolov, Tomáš, Karafiát, Burget, et al. 2010.
“Recurrent Neural Network Based Language Model.” In
Eleventh Annual Conference of the International Speech Communication Association.
Mikolov, Tomas, Le, and Sutskever. 2013.
“Exploiting Similarities Among Languages for Machine Translation.” arXiv:1309.4168 [Cs].
Mitra, and Craswell. 2017.
“Neural Models for Information Retrieval.” arXiv:1705.01509 [Cs].
Mohri, Pereira, and Riley. 1996.
“Weighted Automata in Text and Speech Processing.” In
Proceedings of the 12th Biennial European Conference on Artificial Intelligence (ECAI-96), Workshop on Extended Finite State Models of Language.
Pennington, Socher, and Manning. 2014.
“GloVe: Global Vectors for Word Representation.” Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2014).
Salakhutdinov. 2015.
“Learning Deep Generative Models.” Annual Review of Statistics and Its Application.
Schlag, and Schmidhuber. 2019.
“Learning to Reason with Third-Order Tensor Products.” arXiv:1811.12143 [Cs, Stat].
Solan, Horn, Ruppin, et al. 2005.
“Unsupervised Learning of Natural Languages.” Proceedings of the National Academy of Sciences of the United States of America.
Wolff. 2000.
“Syntax, Parsing and Production of Natural Language in a Framework of Information Compression by Multiple Alignment, Unification and Search.” Journal of Universal Computer Science.