
What this tutorial covers
- Access and scrape website posts and contents using Browserbase
- Write scheduled functions and APIs with Val Town
- Send automated Slack messages via webhooks
Getting Started
In this tutorial, you’ll need a few accounts:
Browserbase
Browserbase is a developer platform to run, manage, and monitor headless browsers at scale. We’ll use Browserbase to navigate and scrape different news sources. We’ll also use Browserbase’s Proxies to simulate authentic user interactions across multiple browser sessions. Sign up for free to get started!
Val Town
Val Town is a platform to write and deploy JavaScript. We’ll use Val Town for three things:
- Create HTTP scripts that run Browserbase sessions. These sessions execute web automation tasks, such as navigating Hacker News and Reddit.
- Write Cron Functions (like Cron Jobs, but more flexible) that periodically run our HTTP scripts.
- Store persistent data in the Val Town provided SQLite database. This built-in database allows us to track search results, so we only send Slack notifications for new, unrecorded keyword mentions.
Twitter (X)
For this tutorial, we’ll use the Twitter API to include Twitter post results. You’ll need to create a new Twitter Developer account to use the API; a Basic Twitter Developer account costs $100/month.
Once you have your SLACK_WEBHOOK_URL, BROWSERBASE_API_KEY, and TWITTER_BEARER_TOKEN, add all three as Val Town Environment Variables.
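Inside a val, these secrets are available through the runtime’s environment (Val Town runs on Deno, so `Deno.env.get` works). As a sketch, a small guard can fail fast if any are missing — `requireEnv` and the injected `getEnv` lookup are illustrative names, not part of the tutorial’s code:

```typescript
// Hypothetical helper: verify required secrets before running a search.
// `getEnv` is injected so the check works the same in Val Town
// (Deno.env.get) or any other runtime.
function requireEnv(
  names: string[],
  getEnv: (name: string) => string | undefined,
): string[] {
  // Collect every variable that is missing or empty.
  return names.filter((name) => !getEnv(name));
}

const REQUIRED = [
  "SLACK_WEBHOOK_URL",
  "BROWSERBASE_API_KEY",
  "TWITTER_BEARER_TOKEN",
];

// In a Val Town val you would call: requireEnv(REQUIRED, Deno.env.get)
```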
Creating our APIs
We’ll use a similar method to create scripts that search and scrape Reddit, Hacker News, and Twitter. First, let’s start with Reddit. To create a new script, go to Val Town → New → HTTP Val. Our script will take in a keyword and return all Reddit posts from the last day that include it. For each Reddit post, we want the output to include the url, date_published, and post title.
In our redditSearch script, we start by importing Puppeteer and creating a Browserbase session with proxies enabled (enableProxy=true). Be sure to get your BROWSERBASE_API_KEY from your Browserbase settings.
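As a sketch of that session setup: Browserbase sessions are driven by pointing Puppeteer at Browserbase’s WebSocket endpoint with your API key. `browserbaseEndpoint` is a hypothetical helper, and the endpoint shape follows Browserbase’s docs at the time of writing — check their current documentation:

```typescript
// Build the Browserbase connection URL; enableProxy=true routes the
// session through Browserbase's proxies.
function browserbaseEndpoint(apiKey: string, enableProxy = true): string {
  return `wss://connect.browserbase.com?apiKey=${apiKey}&enableProxy=${enableProxy}`;
}

// In the val itself you would then connect Puppeteer to that session, e.g.:
// const browser = await puppeteer.connect({
//   browserWSEndpoint: browserbaseEndpoint(Deno.env.get("BROWSERBASE_API_KEY")!),
// });
```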
- Navigate to Reddit and do a keyword search
- Scrape each resulting post’s title, date_published, and url.
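The scraped results can be modeled and filtered like this — `Post` and `postsFromLastDay` are illustrative names rather than the tutorial’s exact code, but they capture the “posts from the last day” requirement:

```typescript
// Hypothetical shape of one scraped result, matching the fields the
// tutorial collects: title, date_published, and url.
interface Post {
  title: string;
  date_published: string; // ISO 8601, e.g. "2024-05-01T12:00:00.000Z"
  url: string;
}

// Keep only posts published within the last 24 hours.
// `now` is injectable to make the function easy to test.
function postsFromLastDay(posts: Post[], now: Date = new Date()): Post[] {
  const dayMs = 24 * 60 * 60 * 1000;
  return posts.filter(
    (p) => now.getTime() - new Date(p.date_published).getTime() <= dayMs,
  );
}
```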
We also use a helper function, convertRelativeDatetoString, to convert dates to a uniform date format. We import this at the top of our redditSearch script.
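A minimal sketch of what such a converter can look like (the actual convertRelativeDatetoString helper may differ): it turns Reddit-style relative dates like “5 hr. ago” into ISO 8601 strings.

```typescript
// Convert a relative date string ("2 hours ago", "5 hr. ago", "3 days ago")
// into an ISO 8601 timestamp. `now` is injectable for testing.
function relativeToISO(relative: string, now: Date = new Date()): string {
  const match = relative.match(/(\d+)\s*(minute|min|hour|hr|day)/i);
  if (!match) return now.toISOString(); // fall back to "now" if unparseable
  const amount = parseInt(match[1], 10);
  const unit = match[2].toLowerCase();
  const msPer: Record<string, number> = {
    minute: 60_000, min: 60_000,
    hour: 3_600_000, hr: 3_600_000,
    day: 86_400_000,
  };
  return new Date(now.getTime() - amount * msPer[unit]).toISOString();
}
```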
We repeat similar steps for Hacker News to create hackerNewsSearch, and use the Twitter API to create twitterSearch.
See all three scripts here:
Reddit → redditSearch
Hacker News → hackerNewsSearch
Twitter → twitterSearch
Creating the Cron Function
For our last step, we create a slackScout cron job that runs every hour and calls redditSearch, hackerNewsSearch, and twitterSearch. To create the cron file, go to Val Town → New → Cron Val.
In our new slackScout file, let’s import our HTTP scripts.
We also define three helper functions for the SQLite table:
- createTable: creates the new SQLite table
- isURLInTable: for each new website returned, checks if the website is already in our table
- addWebsiteToTable: if isURLInTable is false, we add the new website to our table
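These three helpers can be sketched against a minimal `execute` interface, so the same logic works with Val Town’s std/sqlite (`sqlite.execute({ sql, args })`) or, as here, any compatible stub. The table and column names below are assumptions, not the tutorial’s exact schema:

```typescript
// Minimal interface over a SQLite executor; Val Town's std/sqlite
// can be adapted to this shape.
type Execute = (sql: string, args?: unknown[]) => Promise<{ rows: unknown[][] }>;

// createTable: create the table if it doesn't exist yet.
async function createTable(execute: Execute): Promise<void> {
  await execute(
    "CREATE TABLE IF NOT EXISTS slack_scout_websites (url TEXT PRIMARY KEY, title TEXT, date_published TEXT)",
  );
}

// isURLInTable: has this URL already been recorded?
async function isURLInTable(execute: Execute, url: string): Promise<boolean> {
  const result = await execute(
    "SELECT 1 FROM slack_scout_websites WHERE url = ?",
    [url],
  );
  return result.rows.length > 0;
}

// addWebsiteToTable: record a new website so we never notify about it twice.
async function addWebsiteToTable(
  execute: Execute,
  site: { url: string; title: string; date_published: string },
): Promise<void> {
  await execute(
    "INSERT INTO slack_scout_websites (url, title, date_published) VALUES (?, ?, ?)",
    [site.url, site.title, site.date_published],
  );
}
```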
See the full slackScout here.
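As a sketch of the notification step itself, assuming a Slack incoming webhook: `buildSlackMessage` is a hypothetical formatter, while the `{ text: ... }` payload is Slack’s standard incoming-webhook format.

```typescript
// Hypothetical formatter for one new mention.
function buildSlackMessage(post: {
  title: string;
  url: string;
  date_published: string;
}): string {
  return `New mention: ${post.title}\n${post.url} (${post.date_published})`;
}

// POST the message to the Slack incoming webhook stored in
// SLACK_WEBHOOK_URL; { text } is the standard webhook payload.
async function notifySlack(webhookUrl: string, text: string): Promise<void> {
  await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
}
```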