Playwright on Elastic Beanstalk
2022-12-30
I’ve spent a lot of time with web scrapers over the years. BeautifulSoup was my first love. And then it was Puppeteer. But the modern approach is to use Playwright. It’s an incredible tool for all kinds of browser automation and I recommend starting with it.
I appreciate how intuitive the API is and it typically just works for all of my JavaScript needs.
python-3.7 on Amazon Linux 2
Today, I tried setting up Playwright in AWS Elastic Beanstalk. While using the python-3.7 platform, I was unable to execute playwright install via SSH:
$ playwright install
ERROR: cannot install on amzn distribution - only Ubuntu is supported
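The amzn in that message matches the distribution ID that Amazon Linux reports in /etc/os-release. As a rough sketch (this is not Playwright's own check, and the function names and sample text are my own illustrative assumptions), you could inspect that field yourself before attempting the install:

```python
from pathlib import Path


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines in the os-release format into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def distro_id(path: str = "/etc/os-release") -> str:
    """Return the distribution ID (e.g. 'ubuntu' or 'amzn'), or '' if unknown."""
    file = Path(path)
    if not file.exists():
        return ""
    return parse_os_release(file.read_text()).get("ID", "")


# Sample contents resembling Amazon Linux 2 (illustrative, not a full file).
AMZN_SAMPLE = 'NAME="Amazon Linux"\nVERSION="2"\nID="amzn"\nVERSION_ID="2"\n'

if __name__ == "__main__":
    print(distro_id() or "unknown")
```

If this prints something other than ubuntu on your instance, expect playwright install to fail the same way it did for me.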
Lucky for me, I had come across a similar issue at work. We wanted Playwright to be set up on our self-hosted runners for GitHub Actions without having to run playwright install each time a new runner is set up.
The solution was to use a Docker image with playwright pre-installed. We used mcr.microsoft.com/playwright:v1.27.0-focal as the container for each of the steps. See a full example here.
Ubuntu Dockerfile
A simple solution is to recreate the EB environment using the docker platform. We’ll no longer use the python-3.7 platform. That also means the Procfile is no longer needed.
$ eb init --platform docker --region us-east-1 my-app-name
$ eb create \
--platform docker \
--elb-type application \
--region us-east-1 \
-k my-app-keys
From there, I created a Dockerfile with the following contents. You may need to customize this container for your project. I’m using Python + FastAPI.
- Note: You may need to replace main:app with your own entrypoint.
FROM ubuntu:20.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y python3.9 python3.9-dev python3-pip
RUN pip install gunicorn
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
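# install Playwright's browsers and their OS dependencies
# (the playwright package itself must be listed in requirements.txt)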
RUN playwright install --with-deps
COPY . .
EXPOSE 8000
# the -b 0.0.0.0:8000 is required for the load balancer
# to communicate with the container
CMD ["gunicorn", "main:app", "-b", "0.0.0.0:8000", "--worker-class", "uvicorn.workers.UvicornWorker", "--workers", "1"]
That’s all you really need to get playwright configured in your EB environment.
Deploying
Updating your app with your local files is as you’d expect…
$ eb deploy
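After a deploy, it can take a moment for the new container to start answering. The sketch below polls a URL until it returns HTTP 200. The function name, retry counts, and the URL you pass in are my own assumptions; substitute whatever address your environment actually exposes.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 3.0) -> bool:
    """Poll `url` until it answers with HTTP 200, or give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable (or an error status) yet; retry after a pause
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Call it with your environment's URL right after eb deploy finishes. A False result is a hint to go look at the logs.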
Closing Notes
- Docker images are cleaned up after each deployment.
- You’ll probably want to update the SSL certificate in the Listeners section of your load balancer after it’s created.
- To use SSL, update the newly created security group to allow inbound/outbound HTTPS traffic (TCP port 443).
- Use the eb ssh command to connect to your EB environment via SSH (assuming you specified -k in eb create).
- Use the eb logs command to view the logs for your EB environment.