Yan Gao, Jian-Guang Lou, Dongmei Zhang · Published 2022
This paper presents a novel approach to translating natural language questions to SQL queries for given tables, which meets three requirements as a real-world data analysis application: cross-domain, multilingualism and enabling quick-start. Our proposed approach consists of: (1) a novel data abstraction step before the parser to make parsing table-agnosticism; (2) a set of semantic rules for parsing abstracted data-analysis questions to intermediate logic forms as tree derivations to reduce the search space; (3) a neural-based model as a local scoring function on a span-based semantic parser for structured optimization and efficient inference. Experiments show that our approach outperforms state-of-the-art algorithms on a large open benchmark dataset WikiSQL. We also achieve promising results on a small dataset for more complex queries in both English and Chinese, which demonstrates our language expansion and quick-start ability.

Recommendations

Overall similar

Deep learning approaches to semantic parsing require a large amount of labeled data, but annotating complex logical forms is costly. Here, we propose Syntactic Question Abstraction and Retrieval (SQAR), a method to build a neural …
Large-scale semantic parsing datasets annotated with logical forms have enabled major advances in supervised approaches. But can richer supervision help even more? To explore the utility of fine-grained, lexical-level supervision, we introduce Squall, a dataset …
We present a generative model to map natural language questions into SQL queries. Existing neural network based approaches typically generate a SQL query word-by-word, however, a large portion of the generated results are incorrect or …

Similar task

The digitization of data has resulted in making datasets available to millions of users in the form of relational databases and spreadsheet tables. However, a majority of these users come from diverse backgrounds and lack …
In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the …
A significant amount of the world's knowledge is stored in relational databases. However, the ability for users to retrieve facts from a database is limited due to a lack of understanding of query languages such …

Similar method

Many algorithms for Knowledge-Based Question Answering (KBQA) depend on semantic parsing, which translates a question to its logical form. When only weak supervision is provided, it is usually necessary to search valid logical forms for …
Synthesizing programs from examples requires searching over a vast, combinatorial space of possible programs. In this search process, a key challenge is representing the behavior of a partially written program before it can be executed, …
Semantic parsing from denotations faces two key challenges in model training: (1) given only the denotations (e.g., answers), search for good candidate semantic parses, and (2) choose the best model update algorithm. We propose effective …

Similar dataset

Interacting with relational databases through natural language helps users of any background easily query and analyze a vast amount of data. This requires a system that understands users' questions and converts them to SQL queries …
Relational database management systems (RDBMSs) are powerful because they are able to optimize and answer queries against any relational database. A natural language interface (NLI) for a database, on the other hand, is tailored to …
The digitization of data has resulted in making datasets available to millions of users in the form of relational databases and spreadsheet tables. However, a majority of these users come from diverse backgrounds and lack …