llm
Demystifying evals for AI agents
AI-generated images have left us questioning what is real. But the godfather of digital forensics, Hany Farid, is not giving up
An encyclopedia of a universe that does not exist until you visit it.
What happened? In 1.110, we added a setting to add Copilot as coauthor in commit messages by appending Co-authored-by: Copilot [email protected]. The setting git.addAICoAuthor has three different values: off - no attribution no matter w...
Use Claude Code's autonomous agent loop with DeepSeek V4 Pro, OpenRouter, or any Anthropic-compatible backend. Same UX, 17x cheaper. - aattaran/deepclaude
Visual Studio Code. Contribute to microsoft/vscode development by creating an account on GitHub.
A TDD-driven iterative feedback loop for software development. 16 cohesive Claude Code skills walk an idea from brainstorm → plan → execute → iterate, with checkpoints throughout. - evanklem/evanflow
Multi-lens code audit tool — 280 expert AI agents for code review, security testing, and infrastructure auditing - TheMorpheus407/RepoLens
Is security spending more tokens than your attacker?
Our latest model, Claude Opus 4.7, is now generally available. Opus 4.7 is a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks.
A demo combining LeCroy oscilloscope control, SPICE simulation, and Claude Code.
Today, we’re launching Claude Design, a new Anthropic Labs product that lets you collaborate with Claude to create polished visual work like designs, prototypes, slides, one-pagers, and more.
Put Claude Code on autopilot. Define routines that run on a schedule, trigger on API calls, or react to GitHub events from Anthropic-managed cloud infrastructure.
For eight years, I’ve wanted a high-quality set of devtools for working with SQLite. Given how important SQLite is to the industry, I’ve long been puzzled that no one has invested in building a really good developer experience for it. A couple of weeks ago, after ~250 hours of effort over three months on evenings, weekends, and vacation days, I finally released syntaqlite (GitHub), fulfilling this long-held wish. And I believe the main reason this happened was because of AI coding agents. Of course, there’s no shortage of posts claiming that AI one-shot their project or pushing back and declaring that AI is all slop. I’m going to take a very different approach and, instead, systematically break down my experience building syntaqlite with AI, both where it helped and where it was detrimental. I’ll do this while contextualizing the project and my background so you can independently assess how generalizable this experience was. And whenever I make a claim, I’ll try to back it up with evidence from my project journal, coding transcripts, or commit history.
Why the moat is the system, not the model
A new open-source penetration testing framework called METATRON is gaining attention in the security research community for its fully offline, AI-driven approach to vulnerability assessment.
The one where I pack up my bags
Anthropic accidentally shipped a source map in their npm package, exposing the full Claude Code source. Here's what I found inside.