llm
Chat with multiple AI models side-by-side. Compare ChatGPT, Claude, Gemini, and other top LLMs. Crowdsourced benchmarks and leaderboards.
First open-source code agent for Lean 4.
Git is the coding agent. GitHub repository: adamveld12/ghost.
Discover how Anthropic approaches the development of reliable AI agents. Learn about our research on agent capabilities, safety considerations, and technical framework for building trustworthy AI.
Experiments with getting usable outputs out of local models on a standard MacBook
A from-the-ground-up walkthrough of how modern LLMs work, from tokens to transformer blocks to the next-token loop
Major release focused on extensibility, expanded provider support, and enhanced user experience.
The history of observability tools over the past decade has been about a pretty simple concept, but LLMs bring the death of that paradigm.
Claude Sonnet 4.6 is a full upgrade of the model’s skills across coding, computer use, long-reasoning, agent planning, knowledge work, and design.
Announcing a pilot test of a new Claude browser extension
Build custom Skills to teach Claude specialized tasks. Create once, use everywhere—from spreadsheets to coding. Available across Claude.ai, API, and Code.
Deterministic browser automation. Works out of the box with Claude/Codex/OpenCode - theredsix/agent-browser-protocol
An early update on what we've learned from Project Glasswing.
Multi-lens code audit tool — 280 expert AI agents for code review, security testing, and infrastructure auditing - TheMorpheus407/RepoLens
How I deployed nullclaw as a public-facing AI agent on a $5 perimeter box with IRC, tiered inference, and Cloudflare-proxied WebSocket, and why the architecture matters more than the model.
Claude Skills for Obsidian. GitHub repository: kepano/obsidian-skills.
The open source "Jarvis" chats via WhatsApp but requires access to your files and accounts.
Coordinate multiple Claude Code instances working together as a team, with shared tasks, inter-agent messaging, and centralized management.
Pocket Flow: Codebase to Tutorial. GitHub repository: The-Pocket/PocketFlow-Tutorial-Codebase-Knowledge.
Hmmm ... I'm not sure about this. It's interesting, but I'm not yet convinced about its place.
A senior Google engineer publicly praises Anthropic's Claude Code: the tool built in one hour what her team spent a year developing. The quality and efficiency gains exceed anything anyone could have imagined, she says. Plus: Claude Code's creator shares his best workflow tips.