17 comments

  • ctippett 4 hours ago
    As someone who just set up mitmproxy to do something very similar, I wish this would've been a plugin/add-on instead of a standalone thing.

    I know and trust mitmproxy. I'm warier and less likely to use a new, unknown tool that has such broad security/privacy implications. Especially these days with so many vibe-coded projects being released (no idea if that's the case here, but it's a concern I have nonetheless).
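
    For what it's worth, the addon version of this would be tiny. A rough, untested sketch (the host list and log path are just placeholders):

      # llm_logger.py, run with: mitmproxy -s llm_logger.py
      from mitmproxy import http

      LLM_HOSTS = ("api.anthropic.com", "api.openai.com")  # hosts to watch

      class LLMLogger:
          def request(self, flow: http.HTTPFlow) -> None:
              if flow.request.pretty_host in LLM_HOSTS:
                  # append each outgoing request body to a local log
                  with open("llm_requests.jsonl", "a") as f:
                      f.write((flow.request.get_text() or "") + "\n")

      addons = [LLMLogger()]

    That way you inherit mitmproxy's certificate handling, filtering, and UI instead of trusting a new root CA from a new tool.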

    • jmuncor 4 hours ago
      Agree! This was a fun project that I built because it is so hard to understand what is "really" in your context window... What do you mean by plugin/add-on? Add-on to what? Thinking of what to add to it next... Maybe security would be a good direction, or at least visibility into what is happening to the proxy's traffic.
      • mikehotel 2 hours ago
        • jmuncor 1 hour ago
          What would you think of simply using an HTTP relay for all providers? Would that make you feel better security-wise? We could also extend the tool to change the context you are sending and make it more granular to what you want/need...
  • EMM_386 6 hours ago
    This is great.

    When I work with AI on large, tricky code bases I try to do a collaboration where it hands off to me the things that may result in a large number of tokens (excess tool calls, imprecise searches, verbose output, reading large files without a range specified, etc.).

    This will help narrow down exactly which of those to still handle manually to stay within token budgets.

    Note: "yourusername" in install git clone instructions should be replaced.

    • winchester6788 1 hour ago
      I had a similar problem: when Claude Code (or Codex) is running in a sandbox, I wanted to put a cap on large contexts or at least get notified about them.

      Especially because once x0K words are crossed, the output gets worse.

      https://github.com/quilrai/LLMWatcher

      I made this Mac app for the same purpose. Any thoughts would be appreciated.

    • cedws 4 hours ago
      I've been trying to get token usage down by instructing Claude to stop being so verbose (saying what it's going to do beforehand, saying what it just did, spitting out pointless file trees) but it ignores my instructions. It could be that the model is just hard to steer away from doing that... or Anthropic want it to waste tokens so you burn through your usage quickly.
      • egberts1 2 hours ago
        Simply assert that:

        you are a professional (insert concise occupation).

        Be terse.

        Skip the summary.

        Give me the nitty-gritty details.

        You can send all that using your AI client settings.

    • kej 5 hours ago
      Would you mind sharing more details about how you do this? What do you add to your AI prompts to make it hand those tasks off to you?
    • jmuncor 5 hours ago
      Hahahah just fixed it, thank you so much!!!! Thinking of extending this into a prompt admin; I'm sure there is a lot of trash that the system sends on every query, and I think we can improve that.
  • Havoc 5 hours ago
    You don't need to mess with certificates: you can point CC at an HTTP endpoint and it'll happily play along.

    If you build a DIY proxy you can also mess with the prompt on the wire: cut out portions of the system prompt, redirect to a different endpoint based on specific conditions, etc. A rough sketch is below.
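
    A minimal sketch using Flask + requests (the trimming rule is just an illustration; whether a given block is safe to drop is on you):

      # relay.py: DIY relay, no MITM certificates involved
      import requests
      from flask import Flask, Response, request

      app = Flask(__name__)
      UPSTREAM = "https://api.anthropic.com"

      @app.route("/<path:path>", methods=["POST"])
      def relay(path):
          body = request.get_json(force=True)
          # example rewrite: drop system-prompt blocks you don't want to send
          if isinstance(body.get("system"), list):
              body["system"] = [b for b in body["system"] if "unwanted" not in str(b)]
          skip = ("host", "content-length")
          headers = {k: v for k, v in request.headers if k.lower() not in skip}
          upstream = requests.post(f"{UPSTREAM}/{path}", json=body,
                                   headers=headers, stream=True)
          return Response(upstream.iter_content(chunk_size=None),
                          status=upstream.status_code,
                          content_type=upstream.headers.get("content-type"))

      app.run(port=8082)

    Then launch CC with ANTHROPIC_BASE_URL=http://localhost:8082 and it talks to your relay instead.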

    • jmuncor 5 hours ago
      Have you tried this with Gemini? or Codex?
      • thehamkercat 4 hours ago
        I've tried with both gemini-cli and claude-code and it works; honestly, it should work with most if not all CLI clients.
        • jmuncor 1 hour ago
          Working on this feature right now!! Thank you for the suggestion, will start the branch for it... On improving context window usage: now that an HTTP relay lets us intercept the context window, is there anything you think would be cool to implement?
          • jmuncor 51 minutes ago
            Got it on the feature branch http-relay, let me know what you think!
  • david_shaw 6 hours ago
    Nice work! I'm sure the data gleaned here is illuminating for many users.

    I'm surprised that there isn't a stronger demand for enterprise-wide tools like this. Yes, there are a few solutions, but when you contrast the new standard of "give everyone at the company agentic AI capabilities" with the prior paradigm of strong data governance (at least at larger orgs), it's a stark difference.

    I think we're not far from the pendulum swinging back a bit. Not just because AI can't be used for everything, but because the governance on widespread AI use (without severely limiting what tools can actually do) is a difficult and ongoing problem.

    • LudwigNagasena 5 hours ago
      I had to vibe code a proxy to hide tokens from agents (https://github.com/vladimirkras/prxlocal) because I couldn't find any good solution either. I planned to add genai OTel stuff that could be piped into some tool to view dialogues and tool calls and so on, but I haven't found any good setup that doesn't require lots of manual coding yet. It's really weird that there are no solutions in that space.
      • dtkav 3 minutes ago
        Nice, I'm working on something similar with macaroons, so the tokens can be arbitrarily scoped in time and capability too.

        Mine uses an Envoy sidecar on a sandbox container.

        https://github.com/dtkav/agent-creds

    • daxfohl 2 hours ago
      Yes, I was just thinking about how, as engineers, we're trained to document every thought that has ever crossed our minds, for liability and future reference. Yet once an LLM is done with its task, the "hit by a bus" scenario takes place immediately.
  • vitorbaptistaa 1 hour ago
    That looks great! Any plans to allow exports to OpenTelemetry apps like Arize Phoenix? I am looking for ways to connect my Claude Code on the Max plan (no API) to it, and the best I found was https://arize.com/blog/claude-code-observability-and-tracing..., but it seems kinda heavyweight.
    • cetra3 1 hour ago
      Yeah, would love this for Logfire.
      • jmuncor 34 minutes ago
        Something like sherlock start --otel-endpoint?
        • vitorbaptistaa 2 minutes ago
          Yes. It can get a bit more complex, as some OTel endpoints require authentication. You can check Pydantic AI Gateway, Cloudflare AI Gateway, or LiteLLM itself; they do similar things. One advantage of yours would be simplicity.
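
          For reference, the wiring on your side would be small with the standard opentelemetry-python SDK. A rough sketch (the endpoint, token, and attribute names are placeholders for whatever backend is configured):

            # pip install opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
            from opentelemetry import trace
            from opentelemetry.sdk.trace import TracerProvider
            from opentelemetry.sdk.trace.export import BatchSpanProcessor
            from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

            exporter = OTLPSpanExporter(
                endpoint="https://collector.example.com/v1/traces",  # placeholder
                headers={"Authorization": "Bearer <token>"},  # the auth part
            )
            provider = TracerProvider()
            provider.add_span_processor(BatchSpanProcessor(exporter))
            trace.set_tracer_provider(provider)

            # one span per intercepted request; attributes are up to you
            tracer = trace.get_tracer("sherlock")
            with tracer.start_as_current_span("llm.request") as span:
                span.set_attribute("llm.request.model", "claude-...")  # placeholder
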
  • daxfohl 2 hours ago
    Pretty slick. I've been wanting something like this where the transcript is stored under a hash that goes into the corresponding code change's commit message. It'd be good for postmortems of unnoticed hallucinations, and it might even be useful to "revive" the agent and see if it can help debug the problem it created.
  • winchester6788 1 hour ago
    Interesting that you chose to go the MITM way.

    https://github.com/quilrai/LLMWatcher

    Here is my take on the same thing, but as a Mac app, using BASE_URL to intercept Codex and Claude Code, and hooks for Cursor.

  • the_arun 3 hours ago
    I understand this helps if we have our own LLM runtime. What if we use external services like ChatGPT / Gemini (LLM providers)? Shouldn't they provide this feature to all their clients out of the box?
    • jmuncor 42 minutes ago
      This works with Claude Code and Codex... So you can use it with either of those; you don't need a local LLM running... :)
  • FEELmyAGI 6 hours ago
    Dang, how will Tailscale make any money on its latest vibe-coded feature [0] when others can vibe code it themselves? I guess your SaaS really is someone's weekend vibe prompt.

    [0] https://news.ycombinator.com/item?id=46782091

    • 3abiton 5 hours ago
      That's what LLMs enabled: faster prototyping. Also lots of exposed servers and apps. It's never been more fun to be a cybersecurity researcher.
      • jmuncor 5 hours ago
        I think it has just become more fun to be into computers overall!
        • pixl97 5 hours ago
          It's interesting because if you're into computers it's more accessible than ever and there are more things you can mess with more cheaply than ever. I mean we have some real science fiction stuff going on. At the same time it's probably different for the newer generations. Computers were magical to me and a lot of that was because they were rare. Now they are everywhere, they are just a backdrop to everything else going on.
          • jmuncor 5 hours ago
            I agree. I remember when feed-forward NNs were the shit! And now LLMs are owning; I think this adoption pattern will start pulling a lot of innovation into other computer science fields. Networking, for example. And the ability to have that pair programmer next to you makes it so much more fun to build: where before you had to spend a whole day debugging something, Claude now just helps you out and gives you time to build. Feels like long road trips with cruise control and lane-keeping assist!
  • mrbluecoat 6 hours ago
    So is it just a wrapper around MitM Proxy?
    • guessmyname 6 hours ago
      > So is it just a wrapper around MitM Proxy?

      Yes.

      I created something similar months ago [*] but using Envoy Proxy [1], mkcert [2], my own Go (golang) server, and Little Snitch [3]. It works quite well. I was the first person to notice that Codex CLI now sends telemetry to ab.chatgpt.com and other curiosities like that, but I never bothered to open-source my implementation because I know that anyone genuinely interested could easily replicate it in an afternoon with their favourite Agent CLI.

      [1] https://www.envoyproxy.io/

      [2] https://github.com/FiloSottile/mkcert

      [3] https://www.obdev.at/products/littlesnitch/

      [*] In reality, I created this something like 6 years ago, before LLMs were popular, originally as a way to inspect all outgoing HTTP(s) traffic from all the apps installed in my macOS system. Then, a few months ago, when I started using Codex CLI, I made some modifications to inspect Agent CLI calls too.

      • tkp-415 6 hours ago
        Curious to see how you can get Gemini fully intercepted.

        I've been intercepting its HTTP requests by running it inside a docker container with:

        -e HTTP_PROXY=http://host.docker.internal:8080 -e HTTPS_PROXY=http://host.docker.internal:8080 -e NO_PROXY=localhost,127.0.0.1

        It was working with mitmproxy for a very brief period, then the TLS handshake started failing and it kept requesting re-authentication when proxied.

        You can get the whole auth flow and the initial conversation starters using Burp Suite and its certificate, but the Gemini chat responses fail in the CLI, which I understand is due to how Burp handles HTTP/2 (you can see the valid responses inside Burp Suite).

        • paulirish 5 hours ago
          Gemini CLI is open source. You don't need to intercept at the network layer when you can just add inspectGeminiApiRequest() in the source. (I suggest it because I've been maintaining a personal branch with exactly that :)
        • jmuncor 5 hours ago
          Tried with Gemini and it gave more headaches than anything else; would love it if you could help me add it to sherlock... I use Claude and Gemini, Claude mainly for coding, so I wanted to set that up first. With Gemini, I ran into the same problem you did...
    • jmuncor 6 hours ago
      Kind of, yes... But with a nice CLI so that you don't have to set anything up: just run "sherlock claude" and "sherlock start" in two terminals, and everything Claude sends in that session will be stored. So no proxy setup or anything, just simple terminal commands. :)
  • elphard 4 hours ago
    This is fantastic. Claude doesn't make it easy to inspect what it's sending - which would actually be really useful for refining the project-specific prompts.
    • jmuncor 4 hours ago
      Love that you like it!! Let me know any ideas to improve it... I was thinking in the direction of a file system and protocol for the md files, or dynamic context building. But I would love to hear what you think.
  • someguy101010 2 hours ago
    Does this support bedrock?
    • jmuncor 39 minutes ago
      Could add support if you need it! Just let me know :D
  • alickkk 6 hours ago
    Nice work! Do I need to update the Claude Code config after starting this proxy service?
    • jmuncor 6 hours ago
      Nope... You just run "sherlock claude" and that sets up the proxy for you, so you don't have to think about it... Just use Claude normally, and every prompt you send in that session will be stored in the files.
  • andrewstuart 5 hours ago
    What about SSL/certificates?
    • jmuncor 4 hours ago
      I am sorry, I didn't understand the question.
      • actionfromafar 4 hours ago
        I also assumed Claude Code would need some kind of cert nudging to accept a proxy.

        But it's in the README:

        "Prompt you to install it in your system trust store"

  • lifetimerubyist 2 hours ago
    lmao WTAF is this?

    build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/build/lib/sherlock

    • jmuncor 1 hour ago
      That is what you would call vibe-ception... Hahahahah correcting it now! hahahahahahahaha!!