newsletter aggregator proposal

February 6, 2025

requirements

  • ingestion
    • free and paid email newsletters
  • output
    • summarized content based on various criteria
  • pipeline
    • vary input
      • which newsletter sources
      • which newsletter content (e.g. specific issues about semiconductors)
      • other filters (e.g. topic/category/keywords, etc.)
    • vary output
      • which model
      • output length
      • output complexity
      • output structure
    • vary delivery
      • on demand
      • background on schedule
  • simple dashboard
    • kick off generations
    • manage content
    • control sending of newsletters
      • e.g. you might generate 10 outputs, but only want to send one
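
a minimal sketch of how the "vary input / vary output / vary delivery" requirements could be captured as one config object per newsletter. every name here (GenerationConfig, selectIssues, the model string, the field names) is a hypothetical placeholder for illustration, not an agreed schema.

```typescript
// Hypothetical shapes for illustration; none of these names are agreed.
type Delivery =
  | { mode: "on_demand" }
  | { mode: "scheduled"; cron: string };

interface GenerationConfig {
  name: string;
  input: {
    sources: string[];  // which newsletter sources to draw from
    keywords: string[]; // topic/keyword filter; empty means "no filter"
  };
  output: {
    model: string;      // placeholder model label, not a real API identifier
    maxWords: number;   // output length
    complexity: "simple" | "standard" | "expert";
    structure: "bullets" | "prose";
  };
  delivery: Delivery;
}

interface Issue {
  source: string;
  subject: string;
  body: string;
}

// Keep issues from the configured sources that mention at least one keyword.
function selectIssues(issues: Issue[], config: GenerationConfig): Issue[] {
  const keywords = config.input.keywords.map((k) => k.toLowerCase());
  return issues.filter((issue) => {
    if (!config.input.sources.includes(issue.source)) return false;
    if (keywords.length === 0) return true;
    const text = `${issue.subject} ${issue.body}`.toLowerCase();
    return keywords.some((k) => text.includes(k));
  });
}

// Example config for one of several newsletters maintained in parallel.
const semiconductorDigest: GenerationConfig = {
  name: "semiconductor analysis",
  input: { sources: ["Chip Weekly"], keywords: ["semiconductor", "foundry"] },
  output: { model: "flash", maxWords: 400, complexity: "standard", structure: "bullets" },
  delivery: { mode: "on_demand" },
};
```

keeping input, output and delivery in one object means a scheduled run and a dashboard-triggered run can share the same code path and differ only in the delivery field.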

questions

  • is this meant to be fully automated? or do we gather newsletters as input, then run aggregations and generate newsletters manually from a dashboard?

  • do we want to maintain multiple newsletters in parallel? (e.g. ai policy/semiconductor analysis/competitor analysis) [the requirements above presume yes]

  • do we need to support citations? (i.e. links back to specific points in the original content)

  • do we want to support semantic search?

  • do we need to have an API that other code can call?

broad approach

  • ingestion
    • set up an email for inbound
    • use it to sign up to required newsletters
    • when new content comes in
      • parse, categorize, vectorize, store in db
  • pipeline
    • ability to set up steps in pipeline
      • defining input/output parameters
    • click to generate an output
    • click send to send the newsletter (or route the output anywhere else we want)
  • simple dashboard
    • users can log in
    • see stored content
    • manage generations, with input/output parameters
    • see content, and send when needed
    • optional: set up schedules
  • tech stack
    • zapier for email ingestion webhook
    • nextjs app with trpc
    • hook into gemini flash/pro for various parts
    • prisma + neon for database
    • vercel for hosting (or gcp, doesn't really matter)
    • resend for sending simple newsletters
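
a rough sketch of the ingestion flow (parse, categorize, store) written as chained pipeline steps with typed inputs and outputs. assumptions: the InboundEmail shape is a guess at what the ingestion webhook would deliver, keyword matching stands in for an LLM categorization call, an in-memory array stands in for the database, and the vectorize step is left out because it depends on the embedding model chosen. steps are synchronous to keep the sketch short; real steps that call a model or a database would return promises.

```typescript
// A step turns one typed value into another; steps are chained in order.
type Step<I, O> = (input: I) => O;

function chain<A, B, C>(first: Step<A, B>, second: Step<B, C>): Step<A, C> {
  return (input) => second(first(input));
}

// Hypothetical payload shape; the real webhook fields may differ.
interface InboundEmail { from: string; subject: string; text: string; }
interface ParsedIssue { source: string; subject: string; body: string; }
interface CategorizedIssue extends ParsedIssue { categories: string[]; }
interface StoredIssue extends CategorizedIssue { id: number; }

// Normalize the sender and collapse whitespace in the body.
const parse: Step<InboundEmail, ParsedIssue> = (email) => ({
  source: email.from.toLowerCase(),
  subject: email.subject.trim(),
  body: email.text.replace(/\s+/g, " ").trim(),
});

// Placeholder categorizer: keyword matching stands in for an LLM call.
const CATEGORY_KEYWORDS: Record<string, string[]> = {
  semiconductors: ["chip", "semiconductor", "foundry"],
  "ai policy": ["regulation", "ai act", "policy"],
};

const categorize: Step<ParsedIssue, CategorizedIssue> = (issue) => {
  const text = `${issue.subject} ${issue.body}`.toLowerCase();
  const categories = Object.entries(CATEGORY_KEYWORDS)
    .filter(([, words]) => words.some((w) => text.includes(w)))
    .map(([name]) => name);
  return { ...issue, categories };
};

// In-memory array standing in for the database table.
const db: StoredIssue[] = [];

const store: Step<CategorizedIssue, StoredIssue> = (issue) => {
  const row: StoredIssue = { ...issue, id: db.length + 1 };
  db.push(row);
  return row;
};

// The full ingestion pipeline: parse, then categorize, then store.
const ingest = chain(chain(parse, categorize), store);
```

because each step declares its input and output types, a mismatched pipeline (e.g. storing before categorizing) fails at compile time rather than at run time.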

relevant work

  • we built a version of this for Audyo back in 2020, without LLMs
  • built an app for another company that manages input/output generations in a pipeline, similar to what we need here. quick loom demo here.

timeline & cost

  • 3 weeks for MVP
  • 1st milestone: approve basic designs + architectural diagram
  • 2nd milestone: launch + test
  • cost: £15k, paid 50% upfront, 50% on launch

commentary

  • an initial version is fairly straightforward to build
  • assumes you can just pipe all free/paid newsletters to a central location
  • further tweaks could be painful, depending on the exact functionality required