# Nosible — Research

> Long-form research from the Nosible team. Each post has a Markdown mirror at
> /agents/blog/<slug>.md.

**URL:** https://nosible.com/blog

## Posts

- [Matching GPT-5.1 at Financial Sentiment with Active Learning and Qwen3](https://nosible.com/blog/fast-enough-to-matter-productionizing-tiny-transformers-for-signal-extraction) — 2025-12-12. Here's how we fine-tuned Qwen3 0.6B to beat FinBERT and match GPT-5.1 accuracy. Complete with open-source models, datasets, and training scripts. Spoiler alert: active learning is all you need.
- [Can Faceted Search at Web-Scale Self Organize?](https://nosible.com/blog/can-faceted-search-at-web-scale-self-organize) — 2025-10-16. As it turns out, yes it can! In this post we outline our new and improved adaptive named entity tagging system.
- [Introducing Cybernaut-1: Agentic Search using MCTS](https://nosible.com/blog/introducing-cybernaut-1-agentic-search-with-mcts) — 2025-08-26. Cybernaut-1 combines our powerful hybrid-3 search algorithm with LLM-guided Monte Carlo Tree Search to deliver world-class search results on difficult queries.
- [The Road to Cybernaut-1: Rebuilding Search for AI](https://nosible.com/blog/the-road-to-cybernaut-1) — 2025-08-20. AI needs its own search engine. This is how we’re rebuilding search for AI, and the road to Cybernaut-1, the first high-trust agentic search engine.
- [A Pattern for Scaling the Value Proposition of LLMs: Ensemble and Distil 🚀](https://nosible.com/blog/ensemble-and-distil) — 2024-02-06. We introduce the ensemble and distil data pattern and use it to fit an ordinary least squares linear regression that outperforms GPT-4 at financial news sentiment classification using sentence transformer embeddings as features.
- [News Sentiment Showdown: Who Checks Vibes Best?](https://nosible.com/blog/news-sentiment-showdown-who-checks-vibes-best) — 2024-01-28. A comparison of sentiment classifications made by TextBlob, VADER, Flair, SigmaFSA, FinBERT, FinBERT-Tone, Text-Bison, Text-Unicorn, Gemini-Pro, GPT-3.5, GPT-4, and GPT-4-Turbo. We look at accuracy, time, and cost and include a dataset of 10,368 labelled news stories (with code) for our followers.
- [Using Vector Search to See Signals in Company News](https://nosible.com/blog/using-vector-search-to-see-signals-in-company-news) — 2024-01-21. How we use vector search to extract investment signals from a multi-terabyte company news dataset that currently contains over 55 million embeddings, 150+ million sentences, 4+ billion words, and 5+ billion GPT tokens.