AI coding assistants can give developers an immediate boost, but without proper context about your codebase, their value is limited.
Large language models (LLMs), with their exceptional natural language processing capabilities, are expected to revolutionize various industries. Their advanced text generation and translation abilities have already streamlined content creation and customer interactions. They have enabled more efficient data analysis, enhanced decision-making processes in fields from finance to healthcare, and become the engine for tools like AI coding assistants.
In just a year, AI coding assistants have significantly bolstered software development productivity by providing code suggestions, detecting bugs, and automating repetitive tasks. However, the suggestions these tools make are not always useful, and developers often have to spend extra effort refining their prompts.
General-purpose AI coding assistants
Major technology companies quickly entered the race to create AI-driven tools and assistants. GitHub Copilot, specifically designed for coding, debuted in 2021. Then, in 2022, OpenAI’s ChatGPT made millions of people aware of what a large language model can do.
In early 2023, Replit introduced Ghostwriter, and in the early fall of the same year, IBM’s Watsonx Code Assistant, Amazon CodeWhisperer, Tabnine, and Google’s Project IDX were all launched. Open-source alternatives like CodeGeeX from Tsinghua University also contribute to this highly competitive landscape.
These are general-purpose coding assistants because they have been trained on vast amounts of publicly available data. The downside of that approach is that they know nothing about an organization’s specific domain and problems. Moreover, even the most extensive models, such as GPT-4, struggle to memorize knowledge that isn’t frequently mentioned in their training data. Consequently, their answers can be generic and not contextualized to one’s specific domain.
The challenges of using AI-powered code assistants
Concerns with general-purpose LLM-based tools go beyond the issue of generating generic answers.
Another concern is their propensity to produce inaccurate responses, which forces users to put additional effort into refining prompts, increasing developers’ cognitive load and time investment.
Moreover, these tools face a significant limitation in their inability to access or integrate specific knowledge unique to a team or organization.
Inconsistency in the responses
LLMs can be inconsistent, sometimes delivering accurate answers and other times producing random or seemingly clueless output. Because these models lack genuine understanding, they can hallucinate: they generate content that appears correct but is actually nonsensical or simply wrong. This mainly occurs due to errors in the encoding or decoding between texts and the model’s intermediate representations.
Extensive prompt engineering
Another issue with general-purpose AI coding assistants is the substantial reliance on prompt engineering. Recent research highlights the considerable challenge posed by this requirement, as even minor adjustments to the input prompt can yield significant variations in the output. This sensitivity further complicates the usability and reliability of these tools for developers.
Lack of domain-specific knowledge
While tools like GitHub Copilot offer valuable insights into general programming concepts, they often fail to translate specific organizational requirements into code, making it challenging to bridge the gap between high-level project specifications and actual code implementation.
Consequently, developers using these tools often need to extensively refactor the generated code to align it with specific project requirements, coding standards, and other contextual factors.
Contextualized AI coding assistants
There have been industry efforts to mitigate these issues. One particular approach is Retrieval-Augmented Generation (RAG). Introduced by Facebook in 2020, this method involves retrieving relevant documents from a database and then feeding these documents, along with the original question, to the LLM.
RAG is designed to enhance LLM-generated content by anchoring it in external knowledge sources and relevant documents that could enrich the prompt sent to the LLM. In question-answering systems, RAG accesses up-to-date, reliable information and provides transparency to users regarding the model’s information sources, promoting trust and verifiability.
Recent research shows that such retrieval-based techniques often improve generated responses by grounding them in current, accurate data. Another study suggests that the retrieval techniques powering contextualized AI coding assistants could lower users’ barriers to adopting LLM-based tools, while reducing the significant costs of training.
RAG stores information in databases and retrieves it as needed to feed the LLM. By indexing relevant reference materials in advance, the model can draw on these resources at query time to produce more accurate and reliable answers.
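To make the retrieve-then-generate flow concrete, below is a minimal, illustrative sketch of a RAG-style pipeline. The tiny document store, the keyword-overlap scoring, and the llm_complete() stub are simplifying assumptions for this example; real systems typically retrieve with vector embeddings and call a production LLM API.

```python
# A minimal, illustrative RAG sketch (not any vendor's implementation).
# Retrieval here is crude keyword overlap; production systems use
# embedding-based search and a real LLM behind llm_complete().

from collections import Counter

DOCUMENTS = [
    "Payments service: use POST /v2/payments; amounts are in cents.",
    "Commit guideline: prefix messages with the ticket id.",
    "Deprecated: the /v1/orders endpoint was removed in release 24.3.",
]

def score(query: str, doc: str) -> int:
    """Crude relevance score: number of shared lowercase tokens."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents that best match the query."""
    ranked = sorted(DOCUMENTS, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Enrich the user's question with the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

def llm_complete(prompt: str) -> str:
    """Placeholder for a call to an actual LLM API."""
    return f"[model response to a {len(prompt)}-character prompt]"

print(llm_complete(build_prompt("How do I create a payment?")))
```

In a real assistant, the retrieved passages would come from an organization’s own knowledge base, which is precisely the gap that contextualized tools aim to close.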
Why you should use a contextualized AI coding assistant
Developers, myself included, were excited about using general-purpose AI assistants across the software development cycle, but their usefulness is often limited by a lack of context. Consider the scenario of asking a friend for movie recommendations. They might suggest a scary film if they are unaware that you prefer action films or dislike horror movies. Without context, even the best recommendations may miss the mark.
Similarly, tools like ChatGPT can offer generic or unhelpful responses if they don’t grasp a developer’s specific situation. To address this gap, contextualized AI coding assistants are tailored to understand developers’ unique needs more effectively.
Due to the strong need for more specialized AI coding assistants that can ingest organization-specific information and deliver tailored, robust solutions, we developed StackSpot AI.
The main idea behind StackSpot AI is that we can generate better responses if we enrich the prompt with contextualized information, that is, our knowledge sources. We use Facebook’s RAG approach to support this enrichment process.
Examples of representative knowledge sources that StackSpot AI supports include:
- An extensive catalog of APIs that the development team uses regularly;
- Code snippets that exemplify preferred coding paradigms or facilitate code modernization activities;
- Custom artifacts written in natural language, such as documentation, guidelines for repository commits, and lists of software requirements to be implemented.
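As a purely hypothetical illustration of how such heterogeneous knowledge sources might be folded into an enriched prompt, consider the sketch below. The KnowledgeSource type and the enrich() helper are assumptions made for this example, not StackSpot AI’s actual API or implementation.

```python
# Hypothetical sketch: representing knowledge sources of different kinds
# (API catalog entries, code snippets, guidelines) and prepending them
# to a developer's request before it reaches the LLM.

from dataclasses import dataclass

@dataclass
class KnowledgeSource:
    kind: str      # e.g. "api", "snippet", "guideline"
    title: str
    content: str

SOURCES = [
    KnowledgeSource("api", "Billing API",
                    "POST /invoices creates an invoice; amounts are in cents."),
    KnowledgeSource("snippet", "Retry pattern",
                    "Wrap outbound calls in retry_with_backoff()."),
    KnowledgeSource("guideline", "Commit protocol",
                    "Prefix commit messages with the ticket id."),
]

def enrich(question: str, sources: list[KnowledgeSource]) -> str:
    """Prepend the selected knowledge sources to the developer's request."""
    blocks = [f"[{s.kind}] {s.title}: {s.content}" for s in sources]
    return "Context:\n" + "\n".join(blocks) + f"\n\nTask: {question}"

print(enrich("Generate a client for creating invoices.", SOURCES))
```

The enriched prompt gives the model the organization-specific context it would otherwise lack.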
Give StackSpot AI a try for free here!