Showing posts from July, 2025

Tool calling with Moonshot's Kimi K2 model

Moonshot's new Kimi K2 model is extremely inexpensive to use. Moonshot AI (月之暗面) is a prominent Chinese artificial intelligence startup focused on developing large language models with the long-term goal of achieving Artificial General Intelligence (AGI). The company gained significant recognition for its Kimi chatbot, which pioneered an exceptionally large context window capable of processing up to two million Chinese characters in a single prompt. Backed by major investors like Alibaba, Moonshot AI has quickly become one of China's most valuable AI unicorns and a key competitor in the global AI race. The release of Kimi K2 last week has been referred to as "another DeepSeek moment." You need a Moonshot API access key from https://platform.moonshot.ai/console/account. The direct API pricing from Moonshot AI is approximately:

Input tokens: ~$0.60 per 1 million tokens
Output tokens: ~$2.50 per 1 million tokens

Several inference providers in the USA are now supporting...
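As a rough sketch of what tool calling against Moonshot looks like: their endpoint is OpenAI-compatible, so the standard openai Python client works with a changed base_url. The get_weather tool, its schema, and the model name below are illustrative assumptions; check Moonshot's docs for the current model identifiers.

```python
import json
import os

def build_tools():
    """Return a tool schema in the OpenAI function-calling format."""
    return [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical example tool
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }]

def get_weather(city):
    # Stub implementation for the demo; a real tool would call a weather API.
    return json.dumps({"city": city, "temperature_c": 22})

# Only attempt a live call when an API key is configured.
if os.environ.get("MOONSHOT_API_KEY"):
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        api_key=os.environ["MOONSHOT_API_KEY"],
        base_url="https://api.moonshot.ai/v1",  # Moonshot's OpenAI-compatible endpoint
    )
    messages = [{"role": "user", "content": "What is the weather in Flagstaff?"}]
    resp = client.chat.completions.create(
        model="kimi-k2-0711-preview",  # assumed model name; verify in Moonshot's docs
        messages=messages,
        tools=build_tools(),
    )
    msg = resp.choices[0].message
    if msg.tool_calls:
        messages.append(msg)
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            # Run the requested tool and feed the result back to the model.
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": get_weather(**args),
            })
        final = client.chat.completions.create(
            model="kimi-k2-0711-preview", messages=messages, tools=build_tools()
        )
        print(final.choices[0].message.content)
```

The same loop works unchanged against any of the US inference providers that expose Kimi K2 behind an OpenAI-compatible API; only the base_url and model name change.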

Google Gemini Batch Mode API with a 50% cost reduction: a game changer?

I noticed on X this morning that Google dropped a new batch API with a 50% price cut. I use gemini-2.5-flash for speed and low cost, and being able to batch large numbers of requests in a JSONL file (a file where each line is a single legal JSON expression) seems like a big deal to me. Gemini Batch API Docs. I have been a little negative on Hacker News and X recently about the energy costs vs. value of LLM use, and it seems like Google is striking a good middle ground between cost and environmental impact. Automating NLP and other workflows seems fairly simple: write pipeline requests to a JSONL file, submit the batch, periodically poll until the results are complete, then automatically download and use the results in your workflows.
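The first step of that pipeline, writing the requests to a JSONL file, can be sketched as below. The per-line shape (a "key" plus a GenerateContent-style "request") is my reading of the Gemini Batch API docs; verify the exact schema there before relying on it.

```python
import json

def prompts_to_jsonl(prompts):
    """Convert a list of (key, prompt) pairs into JSONL batch-request lines.

    Each output line is one self-contained JSON object, per the JSONL format.
    """
    lines = []
    for key, prompt in prompts:
        record = {
            "key": key,  # lets you match results back to requests
            "request": {"contents": [{"parts": [{"text": prompt}]}]},
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"

def write_batch_file(path, prompts):
    """Write the batch requests to a JSONL file ready for upload."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(prompts_to_jsonl(prompts))
```

From there the remaining steps are upload, create a batch job against gemini-2.5-flash, poll the job state until it completes, and download the results file, which is itself JSONL keyed by the same "key" values.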

So much fun: recreating 1970s text adventure games using LLMs, but better

In the late 1970s, I worked long hours on a text-based adventure game called Land of the Dwarf for the Apple II computer. My game was written in Apple BASIC and I gave it away for free. I designed it on a huge sheet of paper, drawing a transition network diagram: bubbles for locations, action codes for things that could be done, and arcs between the bubbles for the ability to move from one location to another. Yesterday I was really in the mood to do something fun. We didn't have anything going on for Fourth of July celebrations until an outdoor symphony in the early afternoon, so I sat down in the morning with Google's excellent Gemini CLI coding agent and described what I wanted: read a story context from a text input file, then run an LLM in a conversation loop that continuously provides a text-based adventure game experience. I was really surprised how well it turned out. Fun! The generated adventure game code uses a local model running ...
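The core of that conversation loop is simple enough to sketch here. This is not the generated code, just a minimal illustration of the idea: the story context becomes the system prompt, and every player command and narrator reply is appended to the history so the game stays consistent. The llm argument is any callable mapping a message list to a reply string (for example, a local model behind an OpenAI-compatible endpoint).

```python
def make_game(story_context, llm):
    """Return a play(command) function that keeps the full conversation history."""
    messages = [{
        "role": "system",
        "content": (
            "You are the narrator of a text adventure game. "
            "Stay consistent with this story context:\n" + story_context
        ),
    }]

    def play(command):
        # Append the player's command, get the narrator's reply, remember both.
        messages.append({"role": "user", "content": command})
        reply = llm(messages)
        messages.append({"role": "assistant", "content": reply})
        return reply

    return play
```

Wiring it to a terminal is then just a loop like `while True: print(play(input("> ")))`, with the story context read from the text input file.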