On Deep Learning Approaches for Single-Turn Text-to-SQL Parsing: Concepts, Methods, and Future Directions
Abstract:
This literature review surveys deep learning
techniques for single-turn text-to-SQL parsing, covering datasets, evaluation
metrics, models, and methodologies. The study aims to provide a comprehensive
overview of the field, analyzing the strengths, weaknesses, accuracy, practical
applications, and scalability of the various approaches. By examining the current
landscape and future directions, this work serves as a valuable resource for
researchers, industry professionals, and enthusiasts interested in Natural
Language Processing and neural semantic parsing.