Our tech team at MEV is constantly exploring new ways to leverage existing platforms to innovate and solve various problems. Recently, our team explored how we can leverage ANTLR to develop a highly customizable and efficient search engine and to visualize room layouts.
While these use cases are specific to the real estate domain, the solutions can be applied broadly across industries. Our team leverages scientific and technical knowledge, practical expertise, and craftsmanship to develop optimal long-term solutions rather than just quick fixes.
There are various options available for searching published real estate listings, such as Zillow, Redfin, and the MLS. These systems have various limitations and aren't tailored to a specific database. With ANTLR, we can develop a highly custom search system to efficiently find listings in an established database, leveraging user criteria and custom parameters.
We wanted to build a highly personalized and efficient search system that allows users to navigate a broad spectrum of options using over 300 parameters.
In our case, the user may not even remember all these parameters, so we also want to suggest options for them. For example, if they want to find real estate in a specific location, we provide suggestions such as mountainous terrain, proximity to the ocean, distance from a volcano, and more. All of these are offered to the client when searching under the “Location” parameter.
The client should also be able to migrate search queries created in another system, with a different format, into ours. We implemented this capability and, in doing so, were able to identify and correct logical errors in imported queries. In simple terms, a client in another system had configured a search for housing within walking distance of the sea near New York. When we migrated their query to our system, we caught the logical error.
We had to ensure the ease of query formation on the user's side (frontend). Additionally, we needed the ability to analyze these queries on the server side (backend). Finally, we had to translate these queries into a language understandable by the Elasticsearch search server or any other data storage that might be utilized in the future.
We had a metadata subsystem in place where fields for filters, their types, names, etc., were described as a set of server-side structures. Based on this and our goal, we concluded that the language would have a grammar resembling the concept of a predicate. More precisely, it would leverage a set of predicates combined with each other using the conjunction operation.
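To make the idea concrete, a query language of this shape, predicates joined by conjunction, might be sketched in ANTLR4 roughly as follows. This is an illustrative fragment, not our production grammar; the rule and token names are hypothetical:

```antlr
grammar SearchQuery;

// query : one or more predicates joined by conjunction
query     : predicate (AND predicate)* EOF ;

// predicate : a metadata field compared to a value
predicate : FIELD op=( EQ | LT | GT ) value ;

value     : NUMBER | STRING ;

// Lexer rules; keyword rules come before the generic FIELD rule
AND    : 'AND' ;
EQ     : '=' ;
LT     : '<' ;
GT     : '>' ;
FIELD  : [a-zA-Z_]+ ;
NUMBER : [0-9]+ ('.' [0-9]+)? ;
STRING : '"' ~'"'* '"' ;
WS     : [ \t\r\n]+ -> skip ;
```

From a file like this, ANTLR generates both the lexer and the parser, so extending the language later is largely a matter of editing the grammar.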
To implement the functionality of analyzing and translating queries on the server side, we chose ANTLR4. ANTLR (ANother Tool for Language Recognition) is a powerful parser generator used for reading, processing, executing, or translating structured text or binary files. It is widely employed in creating languages, tools, and frameworks.
From a grammar definition, ANTLR generates a parser capable of creating and traversing parse trees. So, having settled on the grammar of our query language, the next step was to describe it in ANTLR.
The lexer and the parser are the two main components used for text processing in ANTLR. These components perform different tasks and utilize different types of rules to analyze the input text.
The lexer is responsible for lexical analysis of the text, which involves breaking down the input text into lexemes: the smallest meaningful units that can be distinguished. The lexer transforms the input text into a sequence of tokens, each of which has properties such as a type and a value. The lexer is also responsible for discarding spaces, comments, and other characters that should not be considered in further analysis.
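As a minimal hand-written sketch of what a generated lexer does (ANTLR generates this for you from the grammar; the token names and patterns here are illustrative, not our production rules):

```python
import re

# Token specification: name -> regex. Order matters, as in an ANTLR lexer:
# the AND keyword is tried before the generic FIELD rule.
TOKEN_SPEC = [
    ("AND",    r"AND\b"),
    ("OP",     r"[=<>]"),
    ("NUMBER", r"\d+(\.\d+)?"),
    ("STRING", r'"[^"]*"'),
    ("FIELD",  r"[A-Za-z_]+"),
    ("WS",     r"\s+"),   # whitespace is skipped, like `-> skip` in ANTLR
]

def tokenize(text):
    """Break the input into (type, value) tokens, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        for name, pattern in TOKEN_SPEC:
            m = re.match(pattern, text[pos:])
            if m:
                if name != "WS":
                    tokens.append((name, m.group()))
                pos += m.end()
                break
        else:
            raise SyntaxError(f"Unexpected character at {pos}: {text[pos]!r}")
    return tokens
```

For example, `tokenize('price < 500000 AND bedrooms = 3')` yields the token pairs `("FIELD", "price")`, `("OP", "<")`, `("NUMBER", "500000")`, `("AND", "AND")`, and so on: exactly the flat stream the parser consumes next.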
The parser, on the other hand, is responsible for syntactic analysis of tokens, meaning it checks whether the sequence of tokens conforms to the syntactic rules described in the grammar. The parser transforms the sequence of tokens into a syntax tree, representing the structure of the input text.
It is important to understand that the lexer differs from the syntax analyzer in that it works at the level of individual tokens rather than a syntax tree. The lexer sees only the current lexeme, while the syntax analyzer can consider the whole token stream and apply more complex logic.
So, if you think of parsing an input file as extracting information from the text, the lexer is responsible for finding tokens, and the syntax analyzer is responsible for applying the grammar's logic. The lexer generates a sequence of tokens that is passed to the syntax analyzer (parser), which uses the grammar rules to check the correctness of the syntax. If the syntax conforms to the grammar, the parser generates a syntax tree representing, in our case, the structure of the query.
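A generated ANTLR parser is far more capable than this, but the core parsing step for a conjunction-of-predicates grammar can be sketched as a small recursive-descent routine. The token format and tree shape here are hypothetical, chosen only to illustrate the idea:

```python
def parse_query(tokens):
    """Parse (type, value) tokens into a tree: ("query", [predicates...]).

    Grammar sketch:  query     : predicate ('AND' predicate)* ;
                     predicate : FIELD OP (NUMBER | STRING) ;
    """
    pos = 0

    def expect(ttype):
        # Consume one token of the given type or fail with a syntax error.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != ttype:
            raise SyntaxError(f"expected {ttype} at token {pos}")
        val = tokens[pos][1]
        pos += 1
        return val

    def predicate():
        field = expect("FIELD")
        op = expect("OP")
        if pos < len(tokens) and tokens[pos][0] == "STRING":
            value = expect("STRING")
        else:
            value = expect("NUMBER")
        return ("predicate", field, op, value)

    preds = [predicate()]
    while pos < len(tokens) and tokens[pos][0] == "AND":
        pos += 1  # consume the AND conjunction
        preds.append(predicate())
    if pos != len(tokens):
        raise SyntaxError("trailing input after query")
    return ("query", preds)
```

Given the token stream for `price < 500000 AND bedrooms = 3`, this returns a two-predicate tree; a malformed stream raises a `SyntaxError`, which is the "check the correctness of the syntax" step described above.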
To make the development of grammar in ANTLR4 more convenient and productive, we utilized the “ANTLR v4” plugin for JetBrains IDEs, such as IntelliJ IDEA. This plugin assists developers in easily creating, editing, and debugging ANTLR4 language grammars.
Leveraging ANTLR4, we addressed several challenges, including validating user queries, migrating queries from other systems, and translating them for the search backend.
In our case, developing our own query language on top of ANTLR4 gave us a powerful tool for syntactic analysis and processing of input queries. Today, we can easily modify the language grammar to meet our needs, update the codebase of the ANTLR4-based functionality, and expand or modify the translation functionality as needed.
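As an illustration of what that translation step can look like, the sketch below walks a set of parsed predicates (represented here as simple tuples) and emits an Elasticsearch bool/filter query. Conjunction maps naturally onto the `filter` clause; the predicate representation and field names are hypothetical, not our production mapping:

```python
def to_elasticsearch(predicates):
    """Translate (field, op, value) predicates into an Elasticsearch query.

    A conjunction of predicates becomes a bool query whose filter clause
    must match every predicate.
    """
    range_ops = {"<": "lt", ">": "gt", "<=": "lte", ">=": "gte"}
    filters = []
    for field, op, value in predicates:
        if op == "=":
            # Exact match on a field
            filters.append({"term": {field: value}})
        elif op in range_ops:
            # Comparison maps onto an Elasticsearch range query
            filters.append({"range": {field: {range_ops[op]: value}}})
        else:
            raise ValueError(f"unsupported operator: {op}")
    return {"query": {"bool": {"filter": filters}}}
```

For example, `to_elasticsearch([("price", "<", 500000), ("bedrooms", "=", 3)])` produces a bool query with one `range` filter and one `term` filter. Because the translation is isolated in one place, swapping Elasticsearch for another data store later means replacing only this layer, not the grammar or the parser.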
Next, we will show another example of using ANTLR4 in real estate technology.