I completely ignored Anthropic’s advice and wrote a more elaborate test prompt based on a use case I’m familiar with and therefore can audit the agent’s code quality. In 2021, I wrote a script to scrape YouTube video metadata from videos on a given channel using YouTube’s Data API, but the API is poorly and counterintuitively documented and my Python scripts aren’t great. I subscribe to the SiIvagunner YouTube account which, as a part of the channel’s gimmick (musical swaps with different melodies than the ones expected), posts hundreds of videos per month with nondescript thumbnails and titles, making it nonobvious which videos are the best other than the view counts. The video metadata could be used to surface good videos I missed, so I had a fun idea to test Opus 4.5:
В России ответили на имитирующие высадку на Украине учения НАТО18:04
。关于这个话题,91视频提供了深入分析
«Я бы с удовольствием принял ядерное оружие от Великобритании и Франции, но такие предложения пока не поступали», — подчеркнул он.
如果说大模型市场的竞争是一场正面硬仗——几个巨头拼算力、拼数据、拼融资,你死我活——那生成式媒体的竞争,更像是一片丛林,没有主角,到处都是机会。