I asked Gemini why Perplexity does well at B4J. Here is what it replied:
Perplexity AI's strength with B4J, despite it being a niche language, comes from its unique architecture, which is fundamentally different from a purely local, "dumb" LLM running on Ollama.
Here's why Perplexity AI likely does well with B4J:
Real-time Web Search and Information Retrieval: This is the absolute biggest factor. Unlike a local LLM that only knows what it was trained on, Perplexity AI is an "answer engine" or "conversational search engine." When you ask it a question, it:
Performs a real-time web search: It actively queries the internet using your prompt (or refined versions of it).
Indexes and processes search results: It intelligently reads through the top web pages returned from its search.
Synthesizes answers from the retrieved information: It then uses its underlying LLMs to generate a concise answer based on the content it just found on the web, along with citations.
Implication for B4J: This means that if there's any information about B4J online – documentation, forum posts, code examples, tutorials, discussions about common B4J patterns, or how B4J relates to Java – Perplexity AI can find it and use that information to construct a relevant answer. It doesn't need to have B4J in its initial training data; it retrieves it on demand.
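The retrieve-then-synthesize loop described above can be sketched in plain Java. This is a hypothetical toy, not Perplexity's actual pipeline: the "web" is a three-page in-memory map, "search" is keyword-overlap ranking, and "synthesis" just stitches snippets together with citation markers, where a real engine would hand the snippets to an LLM.

```java
import java.util.*;
import java.util.stream.*;

// Toy sketch of an answer engine's retrieve-then-synthesize loop.
// All page titles and contents here are invented for illustration.
public class AnswerEngineSketch {

    // Stand-in for the live web: page title -> page text.
    static final Map<String, String> WEB = Map.of(
        "B4X forum: jServer tutorial", "B4J uses the jServer library, built on Jetty, to serve HTTP.",
        "B4X docs: B4J overview", "B4J compiles Basic-like source code into standard Java bytecode.",
        "Unrelated cooking blog", "Preheat the oven to 180 degrees before baking."
    );

    // Steps 1-2: "search" by ranking pages on query-term overlap, keep the top 2.
    static List<Map.Entry<String, String>> search(String query) {
        Set<String> terms = new HashSet<>(Arrays.asList(query.toLowerCase().split("\\W+")));
        return WEB.entrySet().stream()
            .sorted(Comparator.comparingLong((Map.Entry<String, String> e) ->
                terms.stream().filter(t -> e.getValue().toLowerCase().contains(t)).count()).reversed())
            .limit(2)
            .collect(Collectors.toList());
    }

    // Step 3: "synthesize" an answer from the retrieved snippets, with citations.
    // A real engine feeds these snippets to an LLM instead of concatenating them.
    static String synthesize(List<Map.Entry<String, String>> hits) {
        StringBuilder sb = new StringBuilder();
        int n = 1;
        for (var hit : hits) {
            sb.append(hit.getValue()).append(" [").append(n++).append(": ")
              .append(hit.getKey()).append("]\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String query = "How does B4J serve HTTP?";
        System.out.println(synthesize(search(query)));
    }
}
```

The point of the sketch: nothing about B4J needs to be "in" the model ahead of time; relevance is established at query time from whatever the search step turns up.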
Access to Powerful, General-Purpose LLMs (often larger than what you can run locally): Perplexity AI uses a combination of its own models (like Sonar) and can also route requests to leading commercial models such as:
GPT-4 / GPT-4.5 / GPT-4o (or similarly sized models from OpenAI)
Claude 3 / Claude 3.5 (from Anthropic)
Gemini (from Google)
These are much larger and more capable models than anything you can realistically run on an 8GB Jetson Orin Nano. These larger models have a more extensive general knowledge of programming concepts, common patterns, and highly nuanced understanding of major languages like Java. When Perplexity retrieves information about B4J, these powerful models are then better equipped to understand and synthesize that information into actionable B4J-related code or explanations.
Contextual Understanding and Iterative Refinement: Perplexity is designed for multi-turn conversations and often refines its internal search queries and responses based on your follow-up questions. This iterative process helps it home in on the specific B4J context you're discussing.
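One way this kind of refinement can work is query rewriting: a short follow-up like "And write one?" is useless as a standalone search, so it gets merged with context from earlier turns. The sketch below uses a crude length heuristic as the rewrite rule; a real engine would use an LLM for the rewrite, and this class name and logic are invented for illustration.

```java
import java.util.*;

// Toy sketch of multi-turn query refinement: terse follow-ups are expanded
// into standalone search queries using the previous turn as context.
public class QueryRefinementSketch {
    private final Deque<String> history = new ArrayDeque<>();

    // Naive rewrite rule: a follow-up of four words or fewer is assumed to
    // depend on context, so the previous turn's text is prepended to it.
    String refine(String followUp) {
        String query = followUp;
        if (!history.isEmpty() && followUp.split("\\s+").length <= 4) {
            query = history.peekLast() + " " + followUp;
        }
        history.addLast(followUp);
        return query;
    }

    public static void main(String[] args) {
        QueryRefinementSketch s = new QueryRefinementSketch();
        System.out.println(s.refine("How do I read a CSV file in B4J?"));
        System.out.println(s.refine("And write one?"));  // inherits the B4J/CSV context
    }
}
```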
"Understanding" of Java: Since B4J sits on top of Java, Perplexity AI's underlying LLMs have a deep understanding of Java itself. When it retrieves B4J-related information, it can often bridge the gap by connecting B4J concepts to their Java equivalents, which is incredibly useful for problem-solving.
In essence:
Perplexity AI doesn't "know" B4J because it was explicitly trained on a massive B4J dataset. Instead, it "learns" about B4J on the fly by searching the internet for relevant B4J content and then applying the reasoning and generation capabilities of powerful, often much larger, general-purpose LLMs to that retrieved information.