poison-web
AI Model Payload Injection Toolkit - Research tool for embedding invisible payloads in web content. Because sometimes the most dangerous threats are the ones you can't see coming.
A place to capture my various product ideas!
Daily forecasting pipeline that ingests the news, distills the signal, publishes probabilistic predictions, and threads community feedback into a Markdown briefing.
Meta-learning with Qwen using GSPO and MLX for efficient training on Apple Silicon. Built a framework that teaches AI models to learn faster by learning how to learn, because the best AI is one that can teach itself new tricks.
Human research behavior is complex, so I built Mneme to capture it. Transforms raw browsing data into GRPO-ready trajectories that teach AI agents to replicate how humans actually research, because the best AI learns from real human patterns.
Managing AI agents shouldn't require a PhD in prompt engineering. Built Bridge as Mission Control for AI fleets: a CEO-appropriate interface that makes orchestrating AI agents as simple as managing a team of humans.
Google Docs are great for collaboration, but they're not Git-friendly. Built Branch to sync Google Docs into local Git repositories, because version control shouldn't be limited to code. Now you can diff, blame, and branch your documents like any other source code.
Drowning in documents but need a clear report? Built Livy to read the mountain, find the gold, and draft the summary. Because your brainpower is too precious for CTRL+F.
Analysing ethical behaviour of LLMs. Because AI that can't tell right from wrong is just a very smart sociopath waiting to happen.
Chat apps are everywhere, but what if they could pay for themselves? Built an AI chat platform that seamlessly integrates contextual advertising, because great conversations shouldn't cost a fortune to host.
Claude-powered research assistant that blends semantic RAG discovery with tool-driven deep dives and auto-evaluation loops, so questions come back with sourced, defensible answers.
Built a web interface that transforms natural language queries into SQL using a purpose-built DSL. Connect to any database, ask questions in plain English, and get instant insights powered by LLMs.
Built an AI writer that learns from social media feeds using GRPO reinforcement learning. The cool part? It teaches itself to write posts that actually resonate with real audiences by learning from what works.
Watching LLMs process one by one felt like a dial-up flashback. Built ParaLLM to let them all run wild in parallel, because real AI work needs warp speed.
Life sciences need task-level evals, so Galen runs mission-style workflows (data extraction, analysis, reporting) to show which LLMs can actually keep up in the lab.
Curious if a tiny 0.5B LLM could bluff its way through poker with some RL coaching. Turns out, even small models can learn a mean poker face.
Uses SlopRank and YamLLMs to rank LLM agents. Because not all AI agents are created equal; some are just sloppier than others.
LLM evaluation: To see how LLMs *really* stack up at playing poker, I built a casino where AIs (and you!) can go all-in. Because strategy is the ultimate LLM test.
Rankings are often shallow, just leaderboards lacking depth, so LLMRank lets models critically judge each other. Because when everyone's good, you need nuance to know who's great.
Task-specific industry evaluations. Because generic benchmarks are like using a butter knife for brain surgery: you need the right tool for the job.
Cellular automata are mesmerizingly complex, and predicting their next move is incredibly difficult. So I trained a transformer to crack it, then had it evolve even smarter versions of itself to test how new model ensembles could be created.
LLM evaluation: Picking the right LLM for a job is often a guessing game. Built a rigorous bootcamp for models: custom evals, RAG, DB hookups, the works. Because choosing your AI shouldn't be a shot in the dark.
LLM evaluation: LLM outputs can be slick, but how *good* are they, really? Created a linguistic detective kit to dissect their prose. Because style *and* substance matter.
LLM evaluation: LLMs are great at essays, but can they *reason*? So I created an env to throw Wordle, Sudoku, and other logic puzzles at them with LOOP Evals. Because true smarts mean more than just smooth talk.
Everyone wants AI to tell stories, but most attempts are... meh. Coaxed an AI to write surprisingly decent six-part tales with Twain. Because storytelling is an art, even for machines.
LLMs often just steamroll ahead, even when they're wrong. Gave them a 'pause and rethink' button. ReflectGPT lets them catch their own blunders and try again, because real intelligence means admitting you messed up.
Coworker for Life Sciences. Because even AI needs to prove it can handle the complex world of biology before we trust it with our health.
LLM evaluation: Advanced manufacturing needs AI, but generic evals fall short. Designed custom benchmarks to find the sharpest AI tools for the factory floor. Because precision matters, from code to the assembly line.
The Jones Act is famously complex, so I had a bit of fun turning it into a strategy game. Navigating bureaucracy has never been so (intentionally) frustratingly fun.
Conway's Game of Life is a classic. What happens when you give it a quantum spin? QCGOL explores that rabbit hole.
Investment analysis is tough. Sketched out 'Prof,' an AI mentor to train aspiring analysts. The core idea? Even complex finance can be taught with smart AI.
The Pomodoro timer is great, but felt a bit... analog. Infused it with LLM smarts to create Fomodoro, a focus tool that watches your computer and makes sure you're staying on task, the way a good productivity tool should.
What if an LLM could teach itself to be better? Autotune_GPT is that feedback loop: AI improving AI.
Computers got boring, so I made mine talk back. Turns out an AI voice that's sharp, helpful, and just a bit snarky made working a lot more fun.
The VC world is a maze. Built 'Mini VC' to simulate the hustle and find patterns in the chaos. Even a simulated investor needs a good thesis.
Testing ground for new ideas and experiments. Because every great project starts with a simple test.
We need new form factors to use AI. Made a Chrome extension that brings GPT smarts to any webpage: highlight, click, understand.
Before 'agents' were all the rage, there was Loopy: an early experiment in getting LLMs to think, observe, and act.
Needed a way to automate Slack chores and actually *use* the data. Built a bot that not only messages but also neatly logs everything for analysis.
Jumping into a new Python codebase can be daunting. Repo Reader uses GPT to give you the lay of the land, like a friendly AI guide for unfamiliar code.
How does information *really* flow and grow? Dove into network analysis to see how connections and shapes impact our collective brainpower.
Scrape a website and all sub-domains. Because sometimes you need to see what's hiding beneath the surface.
Getting LLMs to play by the rules is a big deal (and a fun puzzle). A cheeky way to keep their creativity in check. Maybe even accidentally solved alignment. Mostly kidding. Maybe.
Also before agents were du jour: What if code could write itself... and then actually run? Autocoder was my dive into that meta-dream. Because the ultimate dev tool builds itself.
Growth and innovation are the engines of progress, but what *really* drives them? Dug into the data to find out. Because understanding the past is key to building a cooler future.
Company knowledge felt scattered across Docs and Slack. Built a system to hoover it all up, pop it in a DB, and let you ask it anything.
Wanted GPT answers with actual Google-backed sources. GPT-Search was born. Yeah, should've turned it into Perplexity. Hindsight's 20/20, but still a cool hack!
Choose a file, extract text from it, and recursively summarise. Because sometimes you need to turn a novel into a tweet.
Some basic GPT calls I wanted to put in one place so I don't have to write them again and again. Because reinventing the wheel is so 2022.