Description
Junior Data Engineer (Python/Web Scraping/Data Quality)
We're looking for a sharp, curious, and driven Junior Data Engineer to join our team at Forecasa, a U.S.-based data startup focused on delivering high-quality real estate data and analytics to lenders and investors. In this role, you'll be part of our Data Acquisition & Quality team, helping us scale and improve the systems that collect, validate, and monitor the data that powers our platform.

What You'll Do
- Develop and maintain Python-based web scrapers to collect structured and unstructured data from various sources.
- Use tools like Selenium, BeautifulSoup, Pandas, and PySpark to extract and normalize data efficiently.
- Package scrapers as Docker containers and deploy them to Kubernetes.
- Create and manage Airflow DAGs to orchestrate and schedule scraping pipelines.
- Build data validation pipelines to catch anomalies, missing values, and data inconsistencies.
- Set up Grafana dashboards to monitor pipeline health and data quality metrics.
- Collaborate with senior engineers to continuously improve scraper reliability, performance, and coverage.

Our Tech Stack
Python, PySpark, Selenium, Airflow, Pandas, Postgres, S3, Docker, Kubernetes, GitLab, Grafana

What We're Looking For
- Solid experience in Python, especially in building web scrapers.
- Familiarity with libraries like Selenium, BeautifulSoup, or Scrapy.
- Some experience with Docker, Airflow, or other workflow orchestration tools.
- Basic understanding of data validation, data cleaning, and monitoring best practices.
- A resourceful, problem-solving mindset: you're not afraid to dig into a messy site or debug a flaky scraper.

Bonus Points For
- Experience working with Grafana or Prometheus for monitoring.
- Exposure to cloud platforms (AWS preferred) and managing scrapers at scale.
- Familiarity with CI/CD and Git workflows (we use GitLab).

About Us
Forecasa is a U.S.-based startup delivering enriched real estate transaction data to private lenders and investors. We're a small, fast-moving team with a strong engineering culture and a mission to bring clarity and transparency to a fragmented market.

Location
Remote. We welcome candidates from anywhere in the world.
NOTE: Please send all e-mails and other communications through the Djinni website. Thank you.